fix(backend): Add input validation, error handling, and security hardening to server.py#422
fix(backend): Add input validation, error handling, and security hardening to server.py#422zohaib-7035 wants to merge 4 commits into
Conversation
…ening to server.py Problem: - No input validation on any API endpoint (empty text, huge payloads, invalid types) - Missing error handling causes 500 errors with stack traces leaked to clients - /getTranscript endpoint vulnerable to command injection via unsanitized video_id - /getTranscript has race condition when concurrent requests use glob on shared folder - File uploads have no size limits or extension validation - No structured logging (bare print statements) - Hardcoded credential paths with no env variable fallback Solution: - Add _validate_input_text() helper: checks for None, empty, type, and max length (50k chars) - Add _validate_max_questions() helper: clamps to safe range [1, 20] - Add _allowed_file() helper: whitelist txt/pdf/docx extensions - Validate video_id against strict regex pattern (alphanumeric, 11 chars) before subprocess - Add 30s timeout to yt-dlp subprocess call - Scope subtitle files to video_id to prevent race conditions between concurrent requests - Add Flask MAX_CONTENT_LENGTH (10 MB) with 413 error handler - Wrap all endpoints in try/except with user-friendly error messages (no stack traces) - Replace print() with Python logging module throughout - Use os.environ.get() for Google service account file path - Add input validation to generate_gform (qa_pairs and question_type) - Use request.get_json(silent=True) to handle malformed JSON gracefully Co-authored-by: Cursor <cursoragent@cursor.com>
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughCentralized input validation and logging; guarded Google Docs/Forms integration; MediaWiki optional expansion; robust transcript/upload handling with timeouts and scoped file cleanup; added hard-question endpoints and stricter error responses; changed QuestionGenerator.generate to accept an int | None for num_questions. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Server as Server
participant QAModel as QA_Model
participant GoogleDocs as Google_Docs
participant MediaWiki as MediaWiki
participant YouTube as YouTube
Client->>Server: POST /get_mcq (input_text, max_questions, options)
alt use_mediawiki or hard variant
Server->>MediaWiki: expand(article)
MediaWiki-->>Server: expanded_text
end
alt use_google_docs
Server->>GoogleDocs: fetch(document_url)
GoogleDocs-->>Server: document_text / 503 error
end
alt transcript requested
Server->>YouTube: fetch/transcribe(video_id) with timeout
YouTube-->>Server: transcript or 4xx/5xx
end
Server->>QAModel: generate(article_text, num_questions, use_evaluator)
QAModel-->>Server: questions (+answers)
Server-->>Client: 200 JSON (questions, answers) or 4xx/5xx
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/server.py (2)
355-527:⚠️ Potential issue | 🟠 Major
generate_gformis missing try/except wrapping, andwebbrowser.open_new_tabis a server-side anti-pattern.While all other endpoints were hardened with try/except + generic error messages,
generate_gformhas no exception handling around the Google Forms API calls (lines 518–527). A failure here will leak a full stack trace to the client.Additionally,
webbrowser.open_new_tab(...)on line 524 attempts to open a browser on the server machine, which is meaningless (and will fail silently or error) in any deployed environment. The edit URL should be returned to the client in the response instead.Finally,
jsonify(result["responderUri"])on line 523 wraps a plain string, producing a bare JSON string response rather than a structured object — inconsistent with every other endpoint.Suggested fix (lines 518–527)
- result = form_service.forms().create(body=NEW_FORM).execute() - form_service.forms().batchUpdate( - formId=result["formId"], body=NEW_QUESTION - ).execute() - - edit_url = jsonify(result["responderUri"]) - webbrowser.open_new_tab( - "https://docs.google.com/forms/d/" + result["formId"] + "/edit" - ) - return edit_url + try: + result = form_service.forms().create(body=NEW_FORM).execute() + form_service.forms().batchUpdate( + formId=result["formId"], body=NEW_QUESTION + ).execute() + + return jsonify({ + "responder_url": result["responderUri"], + "edit_url": f"https://docs.google.com/forms/d/{result['formId']}/edit", + }) + except Exception: + logger.exception("Error creating Google Form") + return jsonify({"error": "Failed to create Google Form"}), 500
366-381:⚠️ Potential issue | 🟠 MajorOAuth interactive flow (
tools.run_flow) will not work on a headless server.
tools.run_flow(flow, store)on line 373 launches an interactive browser-based OAuth consent flow. This is designed for local CLI tools, not for a web server handling API requests. On a deployed server, this will either hang or crash.This is pre-existing code, but since the PR is hardening this endpoint (adding input validation on lines 357–365), it would be worth addressing — or at minimum wrapping in a try/except and returning an appropriate error.
🤖 Fix all issues with AI agents
In `@backend/server.py`:
- Around line 680-725: The code can raise a NameError if
clean_transcript(subtitle_file) fails because transcript_text is referenced
after the finally cleanup; also expected_vtt is unused. Wrap the call to
clean_transcript in a try/except that catches Exception, logs the error via
logger (include video_id and exception), returns an appropriate error response
(e.g., 500) and still cleans up files in the finally block; remove the unused
expected_vtt assignment entirely. Specifically, modify the block that calls
clean_transcript(subtitle_file) so transcript_text is only referenced after a
successful call (or set a safe default only after success), handle exceptions
from clean_transcript with logger.error and jsonify error response, and keep the
existing cleanup loop over matching_files.
- Around line 609-636: The uploaded filename isn't sanitized which allows path
traversal despite _allowed_file checking extensions; update the upload handling
to sanitize filenames (use werkzeug.utils.secure_filename) before passing to
FileProcessor.process_file or, better, apply secure_filename inside
FileProcessor.process_file itself for defense-in-depth; specifically, ensure
upload_file uses secure_filename on request.files["file"].filename (and/or
modify FileProcessor.process_file to call secure_filename on file.filename
before any os.path.join with self.upload_folder) and keep references to
file_processor, upload_file, _allowed_file, and FileProcessor.process_file when
making the change.
- Around line 541-554: The bug is passing a list (input_questions) as
num_questions to qg.generate and mis-typing the generate() signature; ensure
num_questions is an integer before calling qg.generate and fix the type
annotation. Convert/coerce the request value (input_questions) to an int (e.g.,
parse int if a string, or use len(...) if a list was passed intentionally) and
pass that integer to qg.generate(article=..., num_questions=that_int, ...);
update every call site that currently passes input_questions (the calls to
qg.generate at the places noted and the other two occurrences) and change the
generate(...) signature from num_questions: bool = None to num_questions: int =
None and any internal assumptions in _get_ranked_qa_pairs to use the integer
type.
🧹 Nitpick comments (3)
backend/server.py (3)
706-711: Glob pattern is slightly broader than intended.
glob.glob(f"subtitles/{video_id}*.vtt")would also match files whose names start with the same 11 characters but have additional characters before.vtt(e.g., a hypothetical{video_id}xyz.vtt). Sinceyt-dlpoutputs files like{video_id}.en.vtt, a more precise pattern would be:- matching_files = glob.glob(f"subtitles/{video_id}*.vtt") + matching_files = glob.glob(f"subtitles/{video_id}.*.vtt")This is low risk given the validated 11-char ID, but tighter is better for correctness.
131-152: Unused exception variableeacross all endpoints.Since
logger.exception(...)automatically captures and logs the exception info, theas ebinding is unused in everyexcept Exception as eclause. Ruff flags this (F841) on lines 150, 174, 198, 231, 276, 301, 328, 350, 552, 579, 605, 634.This is a simple cleanup:
- except Exception as e: + except Exception:Also applies to: 155-176, 179-200
246-258: Consider addingstrict=Truetozip()for defensive validation.Line 249 already checks
len(input_questions) != len(input_options), so the lists should always be equal length when reaching line 255. Addingstrict=Truetozip()would serve as a safety net and satisfies the Ruff B905 lint:- for question, options in zip(input_questions, input_options): + for question, options in zip(input_questions, input_options, strict=True):
| @app.route("/upload", methods=["POST"]) | ||
| def upload_file(): | ||
| if 'file' not in request.files: | ||
| return jsonify({"error": "No file part"}), 400 | ||
| if "file" not in request.files: | ||
| return jsonify({"error": "No file part in the request"}), 400 | ||
|
|
||
| file = request.files['file'] | ||
| uploaded_file = request.files["file"] | ||
|
|
||
| if file.filename == '': | ||
| return jsonify({"error": "No selected file"}), 400 | ||
| if uploaded_file.filename == "": | ||
| return jsonify({"error": "No file selected"}), 400 | ||
|
|
||
| content = file_processor.process_file(file) | ||
|
|
||
| if content: | ||
| return jsonify({"content": content}) | ||
| else: | ||
| return jsonify({"error": "Unsupported file type or error processing file"}), 400 | ||
| if not _allowed_file(uploaded_file.filename): | ||
| return ( | ||
| jsonify({ | ||
| "error": f"Unsupported file type. Allowed types: {', '.join(ALLOWED_UPLOAD_EXTENSIONS)}" | ||
| }), | ||
| 400, | ||
| ) | ||
|
|
||
| try: | ||
| content = file_processor.process_file(uploaded_file) | ||
| if content: | ||
| logger.info("Processed uploaded file: %s", uploaded_file.filename) | ||
| return jsonify({"content": content}) | ||
| else: | ||
| return jsonify({"error": "Could not extract content from the file"}), 400 | ||
| except Exception as e: | ||
| logger.exception("Error processing uploaded file") | ||
| return jsonify({"error": "Failed to process the uploaded file"}), 500 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for secure_filename usage
rg -n 'secure_filename' --type=py
# Also search for FileProcessor class and process_file method
echo "=== FileProcessor.process_file ==="
rg -n 'class FileProcessor' --type=py -A 20
# Check backend/server.py to verify the upload handler
echo "=== Checking backend/server.py line count ==="
wc -l backend/server.py
# Check imports in backend/server.py
echo "=== Imports in backend/server.py ==="
head -50 backend/server.pyRepository: AOSSIE-Org/EduAid
Length of output: 3124
🏁 Script executed:
#!/bin/bash
# Verify lines 609-636 in backend/server.py
echo "=== backend/server.py lines 609-636 ==="
sed -n '609,636p' backend/server.py
# Check full process_file method in FileProcessor
echo -e "\n=== Full process_file method ==="
sed -n '370,390p' backend/Generator/main.py
# Check if _allowed_file function exists and how it validates
echo -e "\n=== _allowed_file function ==="
rg -n '_allowed_file' --type=py -A 5Repository: AOSSIE-Org/EduAid
Length of output: 2676
Sanitize the uploaded filename to prevent path traversal attacks.
The _allowed_file function validates only the file extension, not the filename itself. A malicious filename like ../../etc/cron.d/evil.txt would pass the check (correct .txt extension) but cause FileProcessor.process_file() to write outside the upload folder, since it uses file.filename directly in os.path.join(self.upload_folder, file.filename) without sanitization.
Use werkzeug.utils.secure_filename to sanitize the filename:
Suggested fix
+from werkzeug.utils import secure_filename
+
`@app.route`("/upload", methods=["POST"])
def upload_file():
...
try:
+ uploaded_file.filename = secure_filename(uploaded_file.filename)
content = file_processor.process_file(uploaded_file)Alternatively, apply secure_filename inside FileProcessor.process_file() itself for defense-in-depth.
🧰 Tools
🪛 Ruff (0.14.14)
[error] 634-634: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
🤖 Prompt for AI Agents
In `@backend/server.py` around lines 609 - 636, The uploaded filename isn't
sanitized which allows path traversal despite _allowed_file checking extensions;
update the upload handling to sanitize filenames (use
werkzeug.utils.secure_filename) before passing to FileProcessor.process_file or,
better, apply secure_filename inside FileProcessor.process_file itself for
defense-in-depth; specifically, ensure upload_file uses secure_filename on
request.files["file"].filename (and/or modify FileProcessor.process_file to call
secure_filename on file.filename before any os.path.join with
self.upload_folder) and keep references to file_processor, upload_file,
_allowed_file, and FileProcessor.process_file when making the change.
- Fix potential NameError in /getTranscript by wrapping clean_transcript in try/except and removing unused expected_vtt variable - Fix input_questions (list) incorrectly passed as num_questions (int) in get_shortq_hard, get_mcq_hard, get_boolq_hard; fix type annotation in QuestionGenerator.generate() from bool to int - Add try/except to generate_gform, remove webbrowser.open_new_tab() server-side anti-pattern, return structured JSON response - Remove unused exception variable 'e' across all except blocks (Ruff F841) - Tighten glob pattern in getTranscript to use dot separator - Add strict=True to zip() in get_mcq_answer for defensive validation Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/server.py (1)
744-746:⚠️ Potential issue | 🟡 Minor
subtitlesdirectory creation only runs under__main__— production deployments may fail.When the app is served via Gunicorn/WSGI (not
python server.py),os.makedirs("subtitles", ...)never executes. The/getTranscriptendpoint will then fail whenyt-dlptries to write to the nonexistent directory.Move the directory creation to module level or app initialization.
Proposed fix
app = Flask(__name__) app.config["MAX_CONTENT_LENGTH"] = MAX_UPLOAD_SIZE_MB * 1024 * 1024 # 10 MB +os.makedirs("subtitles", exist_ok=True) ... if __name__ == "__main__": - os.makedirs("subtitles", exist_ok=True) app.run()
🤖 Fix all issues with AI agents
In `@backend/server.py`:
- Around line 596-602: The endpoint passes an invalid answer_style
("true_false") to qg.generate which calls QuestionGenerator.generate_qg_inputs
(which only accepts ["all", "sentences", "multiple_choice"]) causing a
ValueError; fix by changing the call in the endpoint where process_input_text
and qg.generate are used to pass a supported style (e.g., set
answer_style="sentences" to match get_shortq_hard) or, if true/false output is
required, add "true_false" handling inside QuestionGenerator.generate_qg_inputs
and its downstream logic; update references in the qg.generate call and ensure
tests cover the chosen behavior.
- Around line 317-323: The bug is that predict_boolean_answer expects
payload["input_question"] to be a list but the loop passes a string (question),
causing character-level iteration; fix by passing the question wrapped in a list
when calling answer.predict_boolean_answer (i.e., use {"input_text": input_text,
"input_question": [question]}) so predict_boolean_answer receives a list; update
the call site in the loop that builds output (where qa_response is produced) and
keep the existing logic that appends "True" or "False" based on qa_response.
- Around line 542-548: The call to QuestionGenerator.generate in server.py is
ignoring num_questions because generate only enforces num_questions when
use_evaluator=True; update the hard-question endpoints (get_shortq_hard,
get_mcq_hard, get_boolq_hard) to call qg.generate(..., num_questions=...,
use_evaluator=True, ...) so num_questions is honored (or alternatively add clear
docs/comments that num_questions is advisory); ensure you modify each endpoint
that invokes qg.generate to include use_evaluator=True and keep the existing
num_questions argument.
- Around line 598-605: qg.generate returns dicts like {"question": ...,
"answer": ...} but the code maps make_question_harder over the whole dicts
(variable generated) so make_question_harder receives a dict instead of a
string; change the mapping to call make_question_harder(item["question"]) for
each item in generated and preserve the original answer in the output (e.g.,
build harder_questions as [{"question": make_question_harder(item["question"]),
"answer": item["answer"]} for item in generated]) so the endpoint returns both
the hardened question and its original answer via jsonify.
🧹 Nitpick comments (1)
backend/Generator/main.py (1)
416-416: Type annotation fix is correct; consider makingOptionalexplicit.The
bool → intfix correctly reflects the numeric usage. However, PEP 484 prohibits implicitOptional—num_questions: int = Noneshould beint | None = None(orOptional[int] = None).Proposed fix
- num_questions: int = None, + num_questions: int | None = None,Or for Python < 3.10 compatibility:
from __future__ import annotations
- Fix critical bug: predict_boolean_answer receives a string instead of a list, causing character-level iteration; now wraps question in [question] - Fix critical bug: answer_style="true_false" is not a valid option in QuestionGenerator; changed to "sentences" in get_boolq_hard - Fix critical bug: make_question_harder in get_boolq_hard receives full dict instead of question string; aligned with other hard-question endpoints - Fix major issue: num_questions silently ignored because use_evaluator defaults to False; added use_evaluator=True to all 3 hard-question endpoints - Fix minor issue: subtitles directory creation moved from __main__ guard to module level so it works under Gunicorn/WSGI deployments - Fix nitpick: num_questions type annotation updated to int | None per PEP 484 Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/server.py (1)
386-515:⚠️ Potential issue | 🟡 MinorRequest-building logic is outside the
try/exceptblock — a malformedqa_pairwill cause an unhandledKeyError.Lines 386–515 access
qapair["question"],qapair["answer"], etc. without defensive checks. If any item inqa_pairsis missing a required key, aKeyErroris raised before reaching thetryblock at line 517, resulting in Flask's default 500 error (with a stack trace in debug mode) rather than the structured error response.Either move the request-building inside the existing
try/exceptor validateqa_pairsitem structure up front.Proposed fix (simplest: widen the try block)
+ try: requests_list = [] if question_type == "get_shortq": ... NEW_QUESTION = {"requests": requests_list} - try: result = form_service.forms().create(body=NEW_FORM).execute()
🤖 Fix all issues with AI agents
In `@backend/server.py`:
- Line 254: Remove the Python-3.10-only usage of zip(..., strict=True) in the
loop that iterates over input_questions and input_options; instead call
zip(input_questions, input_options) (or otherwise iterate in a
Python-version-compatible way) because the code already checks lengths earlier,
so drop the strict=True from the loop that references input_questions and
input_options in backend.server.py to avoid runtime errors on Python <3.10.
🧹 Nitpick comments (2)
backend/server.py (2)
9-9: Unused import:json
jsonis imported but never used in this file. The endpoints userequest.get_json()andjsonify()from Flask.Proposed fix
-import json
706-710: Uselogger.exceptioninstead oflogger.errorto include stack traces.In error-handling blocks where an exception is active,
logger.exceptionautomatically includes the traceback, which aids debugging. This aligns with the pattern used elsewhere in the file (e.g., lines 150, 174).Proposed fix
except subprocess.TimeoutExpired: - logger.error("yt-dlp timed out for video %s", video_id) + logger.exception("yt-dlp timed out for video %s", video_id) return jsonify({"error": "Transcript download timed out"}), 504 except subprocess.CalledProcessError as e: - logger.error("yt-dlp failed for video %s: %s", video_id, e.stderr) + logger.exception("yt-dlp failed for video %s: %s", video_id, e.stderr) return jsonify({"error": "Failed to download subtitles"}), 500
The length equality check above the loop already guarantees both lists are the same size, making strict=True redundant. Removing it avoids a runtime error on Python versions below 3.10. Co-authored-by: Cursor <cursoragent@cursor.com>
|
Hey @zohaib-7035, first of all great work on this I was actually looking into these exact 500 errors today. I wrote a quick Python script to stress test the endpoints with about 40+ weird payloads (nulls, missing fields, wrong data types, etc.) and decided to run it against your branch to see how it holds up The good news: your PR is a massive improvement. It dropped the server crashes from 20 down to just 4 on my end. It perfectly catches all the missing input_text and empty JSON body issues and safely returns 400s now My script did catch 4 specific edge cases that still manage to bypass the validation and crash the server with a 500. Just dropping them here in case you want to patch them in this PR:
Solid foundation overall tho! Let me know if you want me to share the Python test script so you can test these locally |
Problem
The Flask backend (
server.py) currently lacks input validation, proper error handling, and security protections across all API endpoints. This leads to several issues:None, extremely long text, or invalid types without returning meaningful errors./getTranscriptendpoint passes unsanitizedvideoIddirectly to asubprocess.run()call without format validation./getTranscriptusesglob("subtitles/*.vtt")which can return another user's subtitle file during concurrent requests.print()statements instead of Python'sloggingmodule.Solution
Input Validation
_validate_input_text()helper — checks forNone, empty, non-string types, and enforces a 50,000 character max length_validate_max_questions()helper — clamps value to safe integer range[1, 20]_allowed_file()helper — whiteliststxt,pdf,docxextensions for uploadsgenerate_gformendpoint (qa_pairsandquestion_typefields)request.get_json(silent=True)to gracefully handle malformed JSONSecurity Hardening
^[a-zA-Z0-9_-]{11}$prevents command injectionyt-dlpcallsvideo_idinstead of using a global globMAX_CONTENT_LENGTHset to 10 MB with a413error handlerGOOGLE_SERVICE_ACCOUNT_FILEenv var for credential pathError Handling
try/exceptblocksLogging
print()statements with Pythonloggingmodule%(asctime)s [%(levelname)s] %(message)sTesting
400with clear error message400with length limit error400Changes
backend/server.py— 406 insertions, 158 deletionsSummary by CodeRabbit
New Features
Bug Fixes
Refactor