Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: Document Remote Code Execution risk for Dataframe.query and computation.eval #58697

Merged
merged 8 commits into from
May 22, 2024

Conversation

r0rshark
Copy link
Contributor

TLDR

This diff adds documentation to the Remote Code Execution risk for Dataframe.query and computation.eval with the same wording used in Dataframe.eval

Details

The Dataframe.query function is internally calling Dataframe.eval here passing the expr parameter.
Dataframe.eval is correctly warning in the documentation that this could lead to Code execution risk.
This is possible because using the @ syntax we can access local variables and by referencing the imported pandas library we can get the os.system function thanks to os being imported inside compat/init.py. In this way passing something like @pd.compat.os.system('whoami') to the expr parameter of Dataframe.query we can achieve code execution.

However in Dataframe.query documentation we do not surface this risk we only point out that pandas.eval is called but also there we do not warn about any code execution risks.

The vulnerable code I found and that also surprised the developer who wrote it of the code execution risk can be summarized in UserControlled input flowing into the DataFrame.query expr value similarly to

import pandas as pd

def api_whatever(request: HTTPRequest):
  filter = request.GET["filter"]
  data = {
    "test": [125, 230, 412],
  }
  df = pd.DataFrame(data)
  df.query(filter)
   ... 

Calling this api with data:
https://www.example.com?filter=@pd.compat.os.system('whoami')
will allow code execution.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Comment on lines 4476 to 4478
This function pass the `expr` parameter to :meth:`~pandas.DataFrame.eval`.
This allows `eval` to run arbitrary code, which can make you vulnerable to code
injection if you pass user input to this function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This function pass the `expr` parameter to :meth:`~pandas.DataFrame.eval`.
This allows `eval` to run arbitrary code, which can make you vulnerable to code
injection if you pass user input to this function.
This method can run arbitrary code, which can make you vulnerable to code
injection if you pass user input to this function.

I don't think we need to mention the implementation details; it's sufficient to warn that query can run arbitrary code.

Comment on lines 196 to 197
This allows `eval` to run arbitrary code, which can make you vulnerable to code
injection if you pass user input to this function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an internal method, right? In such case, I don't think this is necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - this is exposed at the top level. Agree with including it, but I don't think starting with This is accurate - it
reads to me like that refers to the previous sentence. Can just say

eval can run arbitrary code, ...

@rhshadrach rhshadrach added Docs expressions pd.eval, query labels May 21, 2024
Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@rhshadrach rhshadrach added this to the 3.0 milestone May 22, 2024
@rhshadrach rhshadrach merged commit b5b2d38 into pandas-dev:main May 22, 2024
47 checks passed
@rhshadrach
Copy link
Member

Thanks @r0rshark!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs expressions pd.eval, query
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants