Issue: security concerns with `exec()` via multiple agents and Shell tool #5294

juppytt · 2023-05-26T11:38:23Z

Issue you'd like to raise.

TL;DR: The use of exec() in agents can lead to remote code execution vulnerabilities. Some Huggingface projects use such agents, despite the potential harm of LLM-generated Python code.

#1026 and #814 discuss the security concerns regarding the use of exec() in llm_math chain. The comments in #1026 proposed methods to sandbox the code execution, but due to environmental issues, the code was patched to replace exec() with numexpr.evaluate() (#2943). This restricted the execution capabilities to mathematical functionalities only. This bug was assigned the CVE number CVE-2023-29374.

As shown in the above issues, the usage of exec() in a chain can pose a significant security risk, especially when the chain is running on a remote machine. This seems common scenario for projects in Huggingface.

However, in the latest langchain, exec() is still used in PythonReplTool and PythonAstReplTool.
https://github.com/hwchase17/langchain/blob/aec642febb3daa7dbb6a19996aac2efa92bbf1bd/langchain/tools/python/tool.py#L55

https://github.com/hwchase17/langchain/blob/aec642febb3daa7dbb6a19996aac2efa92bbf1bd/langchain/tools/python/tool.py#L102

These functions are called by Pandas Dataframe Agent, Spark Dataframe Agent, CSV Agent. It seems they are intentionally designed to pass the LLM output to PythonAstTool or PythonAstReplTool to execute the LLM-generated code in the machine.

The documentation for these agents explicitly states that they should be used with caution since LLM-generated Python code can be potentially harmful. For instance:
https://github.com/hwchase17/langchain/blob/aec642febb3daa7dbb6a19996aac2efa92bbf1bd/docs/modules/agents/toolkits/examples/pandas.ipynb#L12

Despite this, I have observed several projects in Huggingface using create_pandas_dataframe_agent and create_csv_agent.

Suggestion:

Fixing this issue as done in llm_math chain seems challenging.
Simply restricting the LLM-generated code to Pandas and Spark execution might not be sufficient because there are still numerous malicious tasks that can be performed using those APIs. For instance, Pandas can read and write files.

Meanwhile, it seems crucial to emphasize the security concerns related to LLM-generated code for the overall security of LLM apps. Merely limiting execution to specific frameworks or APIs may not fully address the underlying security risks.

The text was updated successfully, but these errors were encountered:

juppytt · 2023-05-29T06:46:33Z

I have come across a functionality similar to exec() in shell tool available on https://github.com/hwchase17/langchain/blob/master/langchain/tools/shell/tool.py.

Shell tool is designed to execute shell commands generated by the LLM. The documentation for this tool does not provide a warning about its potential safety concerns. (https://python.langchain.com/en/latest/modules/agents/tools/examples/bash.html?highlight=shell)

However, within the code itself, there is a cautionary message stating that it may be unsafe to use the tool.
https://github.com/hwchase17/langchain/blob/99a1e3f3a309852da989af080ba47288dcb9a348/langchain/tools/shell/tool.py#L32-L35

It would be advisable to include a warning in the documentation to alert users about the potential risks involved.

Specifically, when utilizing the shell tool in an LLM application running on a server, it is possible for malicious users to exploit it, gaining control over the shell command execution and potentially running arbitrary code on the server. Similarly, if the LLM application is running on a local machine and receives untrusted data as input, such as a malicious webpage or untrusted files, the generated shell command by the LLM can be manipulated by this untrusted data, enabling the execution of arbitrary code on the local machine. It is crucial to be aware of these security risks when using the shell tool in such scenarios.

mick-net · 2023-06-19T12:13:52Z

Related #2301

dosubot · 2023-09-19T16:05:56Z

Hi, @juppytt! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you raised concerns about the use of exec() in agents, which can lead to remote code execution vulnerabilities. You suggested addressing the security concerns related to LLM-generated code and not solely relying on restricting execution to specific frameworks or APIs. In the comments, you provided an example of a shell tool that also poses potential safety concerns and suggested including a warning in the documentation.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your contribution to the LangChain project!

This was referenced May 27, 2023

Security concerns #1026

Closed

RFC: Use wasm-exec package to sandbox code execution by the Python REPL tool #5388

Closed

juppytt changed the title ~~Issue: security concerns with exec() via Pandas Dataframe Agent, Spark Dataframe Agent, CSV Agent~~ Issue: security concerns with exec() via multiple agents and Shell tools May 29, 2023

juppytt changed the title ~~Issue: security concerns with exec() via multiple agents and Shell tools~~ Issue: security concerns with exec() via multiple agents and Shell tool May 29, 2023

Jflick58 mentioned this issue Jun 2, 2023

Replace exec with wasm_exec #5640

Closed

sfc-gh-jcarroll mentioned this issue Aug 24, 2023

Add warnings on chat_pandas_df langchain-ai/streamlit-agent#30

Merged

dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Sep 19, 2023

dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 26, 2023

dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Sep 26, 2023

eyurtsev added the 🤖:security Related to security issues, CVEs label Mar 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue: security concerns with `exec()` via multiple agents and Shell tool #5294

Issue: security concerns with `exec()` via multiple agents and Shell tool #5294

juppytt commented May 26, 2023 •

edited

Loading

juppytt commented May 29, 2023

mick-net commented Jun 19, 2023

dosubot bot commented Sep 19, 2023

Issue: security concerns with exec() via multiple agents and Shell tool #5294

Issue: security concerns with exec() via multiple agents and Shell tool #5294

Comments

juppytt commented May 26, 2023 • edited Loading

Issue you'd like to raise.

Suggestion:

juppytt commented May 29, 2023

mick-net commented Jun 19, 2023

dosubot bot commented Sep 19, 2023

Issue: security concerns with `exec()` via multiple agents and Shell tool #5294

Issue: security concerns with `exec()` via multiple agents and Shell tool #5294

juppytt commented May 26, 2023 •

edited

Loading