Potential false positive for "Uncontrolled data used in path expression" alert #17226

Open
@tieneupin

Description of the false positive

I'm writing functions to add files to an SQL database, and CodeQL has flagged that the file paths are a potential security risk.

I have constructed a basic validator to make sure that only file paths that exist and start with a given base path when resolved are accepted for subsequent processing. However, CodeQL still views this as being insufficient.

Is this a false positive, or can my validation check be further enhanced? Note that I use resolve so that I can get and compare the start of the file paths.

Code samples or links to source code

# Python version: 3.9.19
# OS: GNU/Linux RHEL8 4.18.0-553.5.1.el8_10.x86_64
from pathlib import Path

# The storage path is defined internally by an environment variable; here is a placeholder
storage_path = Path("/path/to/where/files/are/stored")

# Define a function to validate the file path provided
def validate_file(file: Path) -> bool:
    file = Path(file) if isinstance(file, str) else file  # Pre-empt accidental string inputs
    file = file.resolve()  # Get full path for validation

    # Fail if file doesn't exist
    if not file.exists():
        return False

    # Use path to storage location as reference
    basepath = list(storage_path.parents)[-2]  # This can be made stricter eventually
    if str(file).startswith(str(basepath)):
        return True
    else:
        return False

# How it's used in the script
incoming_file = Path("some_file.txt")  # Can be partial path, or full path on system

if validate_file(incoming_file):
    # Run code here to store the file in the database
    pass
else:
    raise Exception("This file failed the validation check")
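
One detail of the check above: with the placeholder storage path, `list(storage_path.parents)[-2]` evaluates to `/path`, and a plain string prefix comparison would also accept a sibling such as `/pathology/...`. Below is a minimal sketch of a stricter containment check, not a confirmed fix for the alert. It assumes a normalise-then-`startswith` pattern is what the query looks for, which, as far as I understand, is what the query help for this rule illustrates; whether this exact form silences the alert is untested here. `BASE_DIR` and `is_within_base` are hypothetical names, and the existence check from `validate_file` is left out to keep the focus on containment.

```python
import os

# Hypothetical placeholder; in the real code this would come from the environment variable
BASE_DIR = "/path/to/where"


def is_within_base(candidate, base=BASE_DIR):
    """Return True if `candidate` resolves to `base` itself or to a path inside it."""
    # Resolve symlinks and ".." components in the base directory
    base_real = os.path.realpath(base)
    # Relative candidates are interpreted relative to the base directory;
    # absolute candidates replace it (standard os.path.join behaviour)
    full = os.path.realpath(os.path.join(base_real, os.fspath(candidate)))
    # Compare against the base plus a trailing separator so that
    # "/path/to/where_else" does not match "/path/to/where"
    prefix = base_real.rstrip(os.sep) + os.sep
    return full == base_real or full.startswith(prefix)
```

Unlike the original, a partial path such as `some_file.txt` is interpreted relative to the base directory rather than the current working directory; that is a design choice of this sketch and may not match how incoming paths are meant to be handled.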

URL to the alert on GitHub code scanning (optional)
https://github.com/DiamondLightSource/python-murfey/pull/321/checks?check_run_id=28770940998

If this isn't a false positive, and CodeQL is working as intended, advice on mitigating the security issue in this context would be much appreciated. Thanks!
