In [1]:
"""
1. Search Wikipedia
2. Search DuckDuckGo
3. Scrape a website and extract its text
4. Save the research results to a .txt file
Run the agent with the query "Research about the XZ backdoor". The agent should try searching Wikipedia or DuckDuckGo; if it finds a website through DuckDuckGo, it should visit that site, extract its content, and finish by saving the research to a .txt file.
"""

from typing import Type
from langchain.agents import initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI
from langchain.tools import WikipediaQueryRun, BaseTool, DuckDuckGoSearchResults
from langchain.utilities import WikipediaAPIWrapper
from langchain.schema import SystemMessage
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import WebBaseLoader
from pydantic import BaseModel, Field
import bs4


class SearchToolArgsSchema(BaseModel):
    keyword: str = Field(description="The keyword you will search for.")


class CrawlToolArgsSchema(BaseModel):
    url: str = Field(description="The URL of the page to crawl.")


class WriteFileToolArgsSchema(BaseModel):
    information: str = Field(description="The information to write to the txt file.")


class WikipediaSearchTool(BaseTool):
    name = "WikipediaSearch"
    description = "Use this tool to find information about a keyword on Wikipedia. It takes a keyword as an argument."
    args_schema: Type[BaseModel] = SearchToolArgsSchema

    def _run(self, keyword):
        wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
        return wikipedia.run(keyword)


class DuckDuckGoSearchTool(BaseTool):
    name = "DuckDuckGoSearch"
    description = "Use this tool to find sites about a keyword on DuckDuckGo. It takes a keyword as an argument."
    args_schema: Type[BaseModel] = SearchToolArgsSchema

    def _run(self, keyword):
        ddg = DuckDuckGoSearchResults()
        return ddg.run(keyword)


class WebsiteCrawlTool(BaseTool):
    name = "WebsiteCrawl"
    description = "Use this tool to extract information from a website. It takes the URL of the site as an argument."
    args_schema: Type[BaseModel] = CrawlToolArgsSchema

    def _run(self, url):
        # Restrict parsing to elements with the CSS class "content".
        loader = WebBaseLoader(
            url,
            bs_kwargs=dict(parse_only=bs4.SoupStrainer(class_="content")),
        )
        docs = loader.load()

        llm = ChatOpenAI(
            temperature=0,
            model_name="gpt-4o-mini",
        )
        chain = load_summarize_chain(llm, chain_type="stuff")

        return chain.run(docs)


class WriteFileTool(BaseTool):
    name = "WriteFile"
    description = "Use this tool to write information into result.txt file."
    args_schema: Type[BaseModel] = WriteFileToolArgsSchema

    def _run(self, information):
        filename = "result.txt"
        # "w" mode overwrites any previous result.txt on each call.
        with open(filename, "w", encoding="utf-8") as f:
            f.write(information)
        return filename


llm = ChatOpenAI(
    temperature=1e-1,
    model_name="gpt-4o-mini",
)


agent = initialize_agent(
    llm=llm,
    verbose=True,
    agent=AgentType.OPENAI_FUNCTIONS,
    handle_parsing_errors=True,
    tools=[
        WikipediaSearchTool(),
        DuckDuckGoSearchTool(),
        WebsiteCrawlTool(),
        WriteFileTool(),
    ],
    agent_kwargs={
        "system_message": SystemMessage(
            content="""
            You are an information collector.

            You find information about a keyword.

            You use Wikipedia or DuckDuckGo.

            If you used DuckDuckGo, crawl information from each website.

            Finally, write the information into result.txt.
            """
        )
    },
)

agent.invoke("Research about the XZ backdoor")



> Entering new AgentExecutor chain...

Invoking: `WikipediaSearch` with `{'keyword': 'XZ backdoor'}`


[0m[36;1m[1;3mPage: XZ Utils backdoor
Summary: In February 2024, a malicious backdoor was introduced to the Linux build of the xz utility within the liblzma library in versions 5.6.0 and 5.6.1 by an account using the name "Jia Tan". The backdoor gives an attacker who possesses a specific Ed448 private key remote code execution capabilities on the affected Linux system. The issue has been given the Common Vulnerabilities and Exposures number CVE-2024-3094 and has been assigned a CVSS score of 10.0, the highest possible score.
While xz is commonly present in most Linux distributions, at the time of discovery the backdoored version had not yet been widely deployed to production systems, but was present in development versions of major distributions. The backdoor was discovered by the software developer Andres Freund, who announced his findings on 29 March 2024.


  results = ddgs.text(


[33;1m[1;3m[snippet: What Does the Backdoor Do? Malicious code added to xz Utils versions 5.6.0 and 5.6.1 modified the way the software functions. The backdoor manipulated sshd, the executable file used to make remote ..., title: The XZ Backdoor: Everything You Need to Know - WIRED, link: https://www.wired.com/story/xz-backdoor-everything-you-need-to-know/], [snippet: Because the backdoor was discovered before the malicious versions of xz Utils were added to production versions of Linux, "it's not really affecting anyone in the real world," Will Dormann, a ..., title: Backdoor found in widely used Linux utility targets encrypted SSH ..., link: https://arstechnica.com/security/2024/03/backdoor-found-in-widely-used-linux-utility-breaks-encrypted-ssh-connections/], [snippet: Now that the XZ backdoor has control over sshd, attackers can possess the encryption key used to make the SSH connections hide their malicious code in the SSH login certificate and use it as an entry point to execut

{'input': 'Research about the XZ backdoor',
 'output': 'I have gathered information about the XZ backdoor and compiled it into a file named result.txt. Here are the key points included in the document:\n\n### XZ Backdoor Research\n\n#### Overview\nIn February 2024, a malicious backdoor was introduced to the Linux build of the xz utility within the liblzma library in versions 5.6.0 and 5.6.1 by an account using the name "Jia Tan". The backdoor allows an attacker with a specific Ed448 private key to execute remote code on affected Linux systems. This vulnerability has been assigned the CVE number CVE-2024-3094 and has a CVSS score of 10.0, the highest possible score.\n\n#### Discovery\nThe backdoor was discovered by software developer Andres Freund on March 29, 2024. It was found in development versions of major Linux distributions but had not yet been widely deployed in production systems.\n\n#### Functionality\nThe backdoor manipulates the sshd executable, which is used for remote logi
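
The DuckDuckGoSearch output above is one long string of `[snippet: ..., title: ..., link: ...]` entries, and the agent has to pick the URLs out of it before calling WebsiteCrawl. Below is a minimal sketch of doing that extraction in code, assuming the output keeps this format. The helper name `extract_links` and the regular expression are illustrative and not part of LangChain.

```python
import re
from typing import List

# Matches the "link: <url>" field of each "[snippet: ..., title: ..., link: ...]" entry.
# Assumption: URLs contain no whitespace, commas, or closing brackets.
LINK_PATTERN = re.compile(r"link:\s*(https?://[^\]\s,]+)")


def extract_links(ddg_output: str) -> List[str]:
    """Return result URLs in order of appearance, without duplicates."""
    seen = set()
    links = []
    for url in LINK_PATTERN.findall(ddg_output):
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


# Shortened sample in the same shape as the log output above.
sample = (
    "[snippet: What Does the Backdoor Do? ..., "
    "title: The XZ Backdoor: Everything You Need to Know - WIRED, "
    "link: https://www.wired.com/story/xz-backdoor-everything-you-need-to-know/], "
    "[snippet: Now that the XZ backdoor has control over sshd ..., "
    "title: The XZ Backdoor explained - Cybernews, "
    "link: https://cybernews.com/editorial/xz-linux-backdoor-explained/]"
)

for link in extract_links(sample):
    print(link)
```

Returning a plain list of URLs from the search tool would let the agent crawl each result without re-reading the snippets, at the cost of dropping the titles.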

In [2]:
!cat result.txt

# XZ Backdoor Research

## Overview
In February 2024, a malicious backdoor was introduced to the Linux build of the xz utility within the liblzma library in versions 5.6.0 and 5.6.1 by an account using the name "Jia Tan". The backdoor allows an attacker with a specific Ed448 private key to execute remote code on affected Linux systems. This vulnerability has been assigned the CVE number CVE-2024-3094 and has a CVSS score of 10.0, the highest possible score.

## Discovery
The backdoor was discovered by software developer Andres Freund on March 29, 2024. It was found in development versions of major Linux distributions but had not yet been widely deployed in production systems.

## Functionality
The backdoor manipulates the sshd executable, which is used for remote logins, allowing attackers to hijack SSH connections. This could enable unauthorized access to systems, potentially leading to data theft or the installation of malware.

## Impact
The malicious code was introduced through soc

In [3]:
agent.invoke("Research about the XZ backdoor in DuckDuckGo")



> Entering new AgentExecutor chain...

Invoking: `DuckDuckGoSearch` with `{'keyword': 'XZ backdoor'}`



  results = ddgs.text(


[33;1m[1;3m[snippet: What Does the Backdoor Do? Malicious code added to xz Utils versions 5.6.0 and 5.6.1 modified the way the software functions. The backdoor manipulated sshd, the executable file used to make remote ..., title: The XZ Backdoor: Everything You Need to Know - WIRED, link: https://www.wired.com/story/xz-backdoor-everything-you-need-to-know/], [snippet: Now that the XZ backdoor has control over sshd, attackers can possess the encryption key used to make the SSH connections hide their malicious code in the SSH login certificate and use it as an entry point to execute various malicious actions on the compromised device - such as stealing files or installing malware (e.g., ransomware ..., title: The XZ Backdoor explained - Cybernews, link: https://cybernews.com/editorial/xz-linux-backdoor-explained/], [snippet: Because the backdoor was discovered before the malicious versions of xz Utils were added to production versions of Linux, "it's not really affecting anyone in the 

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit of 15 per 60s exceeded for UserByModelByMinute. Please wait 1 seconds before retrying..


[32;1m[1;3m
Invoking: `WebsiteCrawl` with `{'url': 'https://cybernews.com/editorial/xz-linux-backdoor-explained/'}`


[0m[38;5;200m[1;3mYou have been blocked from accessing cybernews.com due to a security measure by Cloudflare, which protects the site from online attacks. This block may have been triggered by specific actions you took. To resolve the issue, you can contact the site owner, providing details of your actions and the Cloudflare Ray ID displayed on the page.[0m[32;1m[1;3m
Invoking: `WebsiteCrawl` with `{'url': 'https://arstechnica.com/security/2024/03/backdoor-found-in-widely-used-linux-utility-breaks-encrypted-ssh-connections/'}`


[0m[38;5;200m[1;3mA backdoor has been discovered in the xz Utils compression tool, affecting versions 5.6.0 and 5.6.1, which are used in various Linux distributions, including Red Hat and Debian. The malicious code, introduced in February 2024, targets SSH connections, allowing unauthorized access to systems by injecting code during t

{'input': 'Research about the XZ backdoor in DuckDuckGo',
 'output': 'The information about the XZ backdoor has been successfully collected and written into result.txt. If you need any further assistance or additional information, feel free to ask!'}
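
In the run above, the crawl of cybernews.com returned a Cloudflare block notice, and the tool summarized that notice as if it were article content. One possible guard is to check the page text for block-page phrases before summarizing. This is a sketch only: `looks_blocked` and `BLOCK_MARKERS` are hypothetical names, and the marker phrases are taken from the notice in the log, on the assumption that the raw page contains similar wording.

```python
from typing import Tuple

# Phrases seen in the block notice in the log above (assumed to also appear
# in the raw page text). Extend for other anti-bot pages.
BLOCK_MARKERS: Tuple[str, ...] = (
    "you have been blocked",
    "cloudflare ray id",
)


def looks_blocked(page_text: str) -> bool:
    """Return True if the text contains any known block-page phrase."""
    lowered = page_text.lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)


blocked_notice = (
    "You have been blocked from accessing cybernews.com due to a security "
    "measure by Cloudflare. To resolve the issue, you can contact the site "
    "owner, providing the Cloudflare Ray ID displayed on the page."
)
article_text = "A backdoor has been discovered in the xz Utils compression tool."

print(looks_blocked(blocked_notice))
print(looks_blocked(article_text))
```

Inside `WebsiteCrawlTool._run`, a check like this could run on the loaded document text and return a short message such as "blocked, try another URL" instead of calling the summarize chain.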

In [4]:
!cat result.txt

# XZ Backdoor Overview

The XZ Backdoor is a significant supply chain attack that targeted XZ Utils, a widely used open-source data compression utility in Linux and Unix-like systems. Discovered by a Microsoft developer, the backdoor was intentionally planted in versions 5.6.0 and 5.6.1, allowing attackers with a specific encryption key to execute arbitrary code via SSH connections. The attack was sophisticated, involving social engineering and manipulation of the software's development process over several years. The malicious code could have enabled extensive control over affected systems, potentially surpassing the impact of the SolarWinds incident. Researchers are currently analyzing the backdoor's mechanisms and its implications for security in open-source software. A CVE designation has been assigned (CVE-2024-3094), and tools are available to detect its presence on affected devices.

## Technical Details

A backdoor has been discovered in the xz Utils compression tool, affecting
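
Comparing the two `!cat result.txt` outputs shows that the second run replaced the first run's report, because `WriteFileTool` opens the file in `"w"` mode. If keeping every run is preferable, a variant could append instead. This is a minimal sketch: the function name `append_result` and the separator format are assumptions, not part of the notebook.

```python
from datetime import datetime


def append_result(information: str, filename: str = "result.txt") -> str:
    """Append one research report to the file under a timestamped separator."""
    stamp = datetime.now().isoformat(timespec="seconds")
    # "a" mode creates the file if it is missing and keeps earlier reports.
    with open(filename, "a", encoding="utf-8") as f:
        f.write(f"--- run at {stamp} ---\n{information}\n\n")
    return filename
```

Calling `append_result(information)` from `WriteFileTool._run` in place of the `open(filename, "w", ...)` block would leave both reports in result.txt, one after the other.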