Skip to content

Tidy up PDF generation#27

Merged
SimonGurney merged 5 commits intomainfrom
pdf_doc
May 28, 2025
Merged

Tidy up PDF generation#27
SimonGurney merged 5 commits intomainfrom
pdf_doc

Conversation

@psec-lcook
Copy link
Copy Markdown
Contributor

Closes #26

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Review Summary

  • High Severity:
    • User input in error messages (saist/util/argparsing.py) is not sanitized, risking information disclosure or command injection. Sanitize inputs or restrict error message exposure to trusted contexts.

Prioritize fixing the input sanitization to prevent potential injection attacks.

Comment thread saist/latex/__init__.py
print(f"✨ Written report to '{pdf_path}'\n")

logger.debug(f"Cleaning up auxiliary files in '{self._DEFAULT_OUTPUT_DIR}'")
subprocess.run(["latexmk", "-c"], cwd=self._DEFAULT_OUTPUT_DIR, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, )
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: Potential command injection vulnerability due to unsanitized input in subprocess.run.

Priority: MEDIUM

CWE: CWE-78

Recommendation: Ensure the cwd parameter is sanitized or hardcoded to prevent command injection.

Snippet: subprocess.run(["latexmk", "-c"], cwd=self._DEFAULT_OUTPUT_DIR, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, )

Comment thread saist/util/argparsing.py
parser.error(
"[Error] Unable to find 'latexmk' binary in $PATH needed for PDF report building, cannot use --pdf flag"
)
parser.error("Unable to find 'latexmk' binary in $PATH needed for PDF report building, cannot use --pdf flag")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: The error message does not sanitize user input, which could lead to information disclosure or command injection if the error message is logged or displayed insecurely.

Priority: LOW

CWE: CWE-209

Recommendation: Sanitize the error message or ensure it is only displayed in a trusted context.

Snippet: parser.error("Unable to find 'latexmk' binary in $PATH needed for PDF report building, cannot use --pdf flag")

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security & Performance Summary:

🔴 High Severity:

  • Command injection risk in saist/latex/__init__.py - sanitize inputs to subprocess.run

🟠 Medium Severity:

  • Poor string handling in saist/main.py - replace loop concatenation with list joining
  • Unclear error messages in saist/util/argparsing.py - add actionable remediation steps

🟡 Low Severity:

  • Potential typo in saist/main.py - verify variable name consistency ('prompt')

Comment thread saist/latex/__init__.py
print(f"✨ Written report to '{pdf_path}'\n")

logger.debug(f"Cleaning up auxiliary files in '{self._DEFAULT_OUTPUT_DIR}'")
subprocess.run(["latexmk", "-c"], cwd=self._DEFAULT_OUTPUT_DIR, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, )
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: Command injection vulnerability due to unsanitized input in subprocess.run.

Priority: HIGH

CWE: CWE-78

Recommendation: Sanitize and validate all inputs used in subprocess.run to prevent command injection.

Snippet: subprocess.run(["latexmk", "-c"], cwd=self._DEFAULT_OUTPUT_DIR, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, )

Comment thread saist/main.py
prompt = ""
for f in findings:
prompt = f"- **File**: `{f.file}`\n - **Issue**: {f.issue}\n - **Recommendation**: {f.recommendation}\n\n"
prompt += f"- **File**: `{f.file}`\n - **Issue**: {f.issue}\n - **Recommendation**: {f.recommendation}\n\n"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: String concatenation in a loop can lead to performance issues and potential memory exhaustion for large datasets.

Priority: MEDIUM

CWE: CWE-400

Recommendation: Use a list to accumulate strings and join them at the end for better performance.

Snippet: prompt += f"- **File**: {f.file}\n - **Issue**: {f.issue}\n - **Recommendation**: {f.recommendation}\n\n"

Comment thread saist/main.py
Uses OpenAI to generate a summary of all findings to be used as the PR review body.
"""
system_prompt = prompts.SUMMARY
prompt = ""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: Potential typo in variable name 'prompt' which could lead to undefined behavior or errors.

Priority: LOW

CWE: CWE-710

Recommendation: Correct the variable name from 'prompt' to 'prompt' to avoid confusion and potential runtime errors.

Snippet: prompt = ""

Comment thread saist/util/argparsing.py
parser.error(
"[Error] Unable to find 'latexmk' binary in $PATH needed for PDF report building, cannot use --pdf flag"
)
parser.error("Unable to find 'latexmk' binary in $PATH needed for PDF report building, cannot use --pdf flag")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: Improper error message handling. The error message does not provide sufficient context or guidance for remediation.

Priority: LOW

CWE: CWE-209

Recommendation: Enhance the error message to include actionable steps, such as installation instructions for 'latexmk'.

Snippet: parser.error("Unable to find 'latexmk' binary in $PATH needed for PDF report building, cannot use --pdf flag")

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Findings Summary

High Severity Issues

Directory Traversal Vulnerability
The file saist/latex/__init__.py uses os.makedirs with user-controlled input (tex_path), which can lead to directory traversal if the input is not sanitized. Sanitize the input to prevent such attacks.

Cross-Site Scripting (XSS) Risk
In saist/main.py, the 'prompt' variable is constructed by concatenating user-controlled inputs (f.file, f.issue, f.recommendation) without proper sanitization. This could result in XSS if the output is rendered in a web context. Sanitize user-controlled inputs using libraries like html.escape or a template engine with auto-escaping.

Subprocess Command Injection Risk
The same file (saist/latex/__init__.py) uses subprocess.run with hardcoded commands, but the working directory (cwd) could be user-controlled. Ensure the working directory is not influenced by user input or sanitize it before use.

Medium Severity Issues

Input Validation Missing for CWE Field
In saist/models.py, the CWE field lacks validation logic to enforce the required format (CWE-XX or CWE-XXX). This could result in malformed CWE identifiers. Implement input validation to ensure the correct format.

Priority Field Validation Missing
The 'priority' field in saist/models.py is an integer without constraints, allowing arbitrary values that may lead to incorrect prioritization. Add validation to restrict the 'priority' field to a defined range (e.g., 1-9).

Information Exposure Risk
The 'context_from_finding' function in saist/main.py retrieves file contents and returns a context around a finding, potentially exposing sensitive data. Ensure sensitive information is filtered or redacted before returning the context.

Grouped Recommendations

  • Sanitize all user-controlled inputs to prevent directory traversal and XSS.
  • Validate input fields (CWE, priority) to enforce correct formats and ranges.
  • Secure subprocess execution by controlling or sanitizing the working directory.
  • Implement data filtering to prevent sensitive information exposure.

Comment thread saist/latex/__init__.py
def _write_tex(self, tex_path: str):
try:
os.makedirs(os.path.dirname(tex_path), exist_ok=True)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: The use of os.makedirs with user-controlled input (tex_path) can lead to directory traversal if the input is not properly sanitized.

Priority: MEDIUM

CWE: CWE-22

Recommendation: Sanitize the input to prevent directory traversal attacks.

Snippet: os.makedirs(os.path.dirname(tex_path), exist_ok=True)

Comment thread saist/main.py
prompt = ""
for f in findings:
prompt = f"- **File**: `{f.file}`\n - **Issue**: {f.issue}\n - **Recommendation**: {f.recommendation}\n\n"
prompt += f"- **File**: `{f.file}`\n - **Issue**: {f.issue}\n - **Recommendation**: {f.recommendation}\n\n"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: The 'prompt' variable is constructed by concatenating user-controlled input (f.file, f.issue, f.recommendation) without proper sanitization, which could lead to Cross-Site Scripting (XSS) if the output is rendered in a web context.

Priority: MEDIUM

CWE: CWE-79

Recommendation: Sanitize user-controlled input before including it in the prompt string to prevent XSS. Use a library like 'html.escape' or a template engine that automatically escapes content.

Snippet: prompt += f"- **File**: {f.file}\n - **Issue**: {f.issue}\n - **Recommendation**: {f.recommendation}\n\n"

Comment thread saist/latex/__init__.py
print(f"✨ Written report to '{pdf_path}'\n")

logger.debug(f"Cleaning up auxiliary files in '{self._DEFAULT_OUTPUT_DIR}'")
subprocess.run(["latexmk", "-c"], cwd=self._DEFAULT_OUTPUT_DIR, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, )
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: The use of subprocess.run with hardcoded commands can still be vulnerable if the working directory (cwd) is user-controlled or influenced.

Priority: MEDIUM

CWE: CWE-78

Recommendation: Ensure the working directory is not user-controlled or sanitize it before use.

Snippet: subprocess.run(["latexmk", "-c"], cwd=self._DEFAULT_OUTPUT_DIR, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, )

Comment thread saist/models.py
issue: str
recommendation: str
cwe: str
cwe: Annotated[str, Field(description= "CWE id, should conform to CWE-XX or CWE-XXX where X is a number") ]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: The CWE field is annotated to require a specific format (CWE-XX or CWE-XXX), but there is no validation logic to enforce this format. This could lead to incorrect or malformed CWE identifiers being stored.

Priority: MEDIUM

CWE: CWE-20

Recommendation: Add input validation to ensure the CWE field conforms to the required format (CWE-XX or CWE-XXX).

Snippet: cwe: Annotated[str, Field(description= "CWE id, should conform to CWE-XX or CWE-XXX where X is a number") ]

Comment thread saist/main.py
for finding in all_findings:
try:
context, start, end = await context_from_finding(scm, finding)
context, start, end = await context_from_finding(scm, finding, 10)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: The 'context_from_finding' function retrieves file contents and returns a context around a finding. If the file contains sensitive data, this could lead to information exposure.

Priority: LOW

CWE: CWE-200

Recommendation: Ensure that sensitive data is not included in the context returned by 'context_from_finding'. Implement filtering or redaction of sensitive information before returning the context.

Snippet: context, start, end = await context_from_finding(scm, finding, 10)

Comment thread saist/models.py
recommendation: str
cwe: str
cwe: Annotated[str, Field(description= "CWE id, should conform to CWE-XX or CWE-XXX where X is a number") ]
priority: int
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issue: The 'priority' field is defined as an integer without any constraints or validation. This could allow arbitrary values, potentially leading to incorrect prioritization of security issues.

Priority: LOW

CWE: CWE-20

Recommendation: Add validation to ensure the 'priority' field is within a defined range (e.g., 1-9).

Snippet: priority: int

@SimonGurney SimonGurney changed the title Add PDF report docs to README Tidy up PDF generation May 28, 2025
@SimonGurney SimonGurney merged commit efc7697 into main May 28, 2025
2 checks passed
@SimonGurney SimonGurney deleted the pdf_doc branch May 28, 2025 23:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update README and output

2 participants