[security] fix(files): contain stored upload paths#92
Merged
mbakgun merged 1 commit intoMay 10, 2026
Conversation
Contributor
|
Thanks ! 🙏 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR fixes a file-storage path traversal issue in the manual upload flow. The upload endpoint accepted the client-controlled multipart filename and passed it directly into the stored path under the configured file-storage root.
The patch now rejects filenames with path components, validates resolved paths before disk access, converts invalid upload filenames into a 400 response, and adds focused regression coverage for POSIX traversal, Windows-style separators, valid uploads, and legacy escaped storage paths.
Security issues covered
POST /api/files/uploadand shared file-storage helpersBefore this PR
upload_file()passedUploadFile.filenamedirectly tostore_file().store_file()builtrelative_path = f"{owner_id}/{file_uuid}/{filename}"without rejecting..,/,\\, drive prefixes, or NUL bytes._storage_root() / relative_pathwithout checking that the resolved destination stayed under the storage root.get_file_path()also trustedGeneratedFile.storage_path, so legacy or malicious stored paths could resolve outside the storage root.After this PR
./.., and NUL bytes are rejected.400 Bad Requestfor invalid filenames instead of allowing the error to become an internal server error.Why this matters
A filename in a multipart upload is attacker-controlled input. When it is appended into a filesystem path without normalization or containment checks, an authenticated user can write arbitrary bytes outside the intended storage directory, limited by the backend process permissions.
Depending on deployment layout and writable paths, that can corrupt application data, overwrite app-owned files, poison served content, or become code execution if an executed or reloaded path is writable.
Attack flow
POST /api/files/uploadwith a multipart filename such as../../../poc-owned-by-attacker.txt.{storage_root}/{owner_id}/{file_uuid}/.Affected code
backend/app/api/files.pyupload_file()backend/app/services/file_storage.pystore_file()get_file_path()delete_file()throughget_file_path()Root cause
CVSS assessment
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:H/A:LSafe reproduction steps
On vulnerable code, the issue can be reproduced with a local harness that uses the same storage path construction:
../../../poc-owned-by-attacker.txt.This PR adds regression tests that exercise this behavior safely without writing outside the temporary test directory.
Expected vulnerable behavior
On vulnerable code, a filename containing traversal segments resolves outside the configured file-storage root and writes attacker-controlled content there.
Changes in this PR
store_file()andget_file_path().400 Bad Requestfor invalid manual upload filenames.Files changed
backend/app/api/files.pybackend/app/services/file_storage.pybackend/tests/test_file_storage.pyMaintainer impact
This keeps existing safe filenames working while refusing filenames that behave as paths. Stored files still live under the same
{owner_id}/{file_uuid}/{filename}layout for valid filenames.The main behavior change is that clients attempting to upload path-like filenames now receive a client error instead of having those names accepted into storage paths.
Fix rationale
The storage boundary should not depend on callers remembering to sanitize filenames. The storage layer is the last common point before disk access, so it now enforces both input shape and resolved-path containment.
The containment check also protects reads/deletes through
get_file_path()if older rows or future bugs introduce unsafestorage_pathvalues.Type of change
Test plan
Commands run:
cd backend uv run pytest tests/test_file_storage.py uv run ruff check app/services/file_storage.py app/api/files.py tests/test_file_storage.py uv run ruff format --check app/services/file_storage.py app/api/files.py tests/test_file_storage.pyResult:
tests/test_file_storage.py: 4 passedNote:
./check.shcould not complete in this local environment becausebunis not installed (./check.sh: line 10: bun: command not found).Disclosure notes
This PR is intentionally bounded to file-storage path containment for generated/manual-uploaded files. It does not claim broader sandboxing or protection for unrelated filesystem access paths.