Skip to content

fix(backend): Add input validation, error handling, and security hardening to server.py#422

Open
zohaib-7035 wants to merge 4 commits into
AOSSIE-Org:mainfrom
zohaib-7035:fix/backend-validation-security
Open

fix(backend): Add input validation, error handling, and security hardening to server.py#422
zohaib-7035 wants to merge 4 commits into
AOSSIE-Org:mainfrom
zohaib-7035:fix/backend-validation-security

Conversation

@zohaib-7035
Copy link
Copy Markdown

@zohaib-7035 zohaib-7035 commented Feb 10, 2026

Problem

The Flask backend (server.py) currently lacks input validation, proper error handling, and security protections across all API endpoints. This leads to several issues:

  1. No input validation — Endpoints accept empty strings, None, extremely long text, or invalid types without returning meaningful errors.
  2. No error handling — When question generation fails, the server returns raw 500 errors with Python stack traces leaked to the client.
  3. Command injection risk — The /getTranscript endpoint passes unsanitized videoId directly to a subprocess.run() call without format validation.
  4. Race condition/getTranscript uses glob("subtitles/*.vtt") which can return another user's subtitle file during concurrent requests.
  5. No upload limits — File uploads have no size restrictions or extension validation.
  6. No structured logging — Uses bare print() statements instead of Python's logging module.
  7. Hardcoded credential paths — No environment variable fallback for Google service account file.

Solution

Input Validation

  • Added _validate_input_text() helper — checks for None, empty, non-string types, and enforces a 50,000 character max length
  • Added _validate_max_questions() helper — clamps value to safe integer range [1, 20]
  • Added _allowed_file() helper — whitelists txt, pdf, docx extensions for uploads
  • Added validation to generate_gform endpoint (qa_pairs and question_type fields)
  • All endpoints now use request.get_json(silent=True) to gracefully handle malformed JSON

Security Hardening

  • Video ID validation: Strict regex pattern ^[a-zA-Z0-9_-]{11}$ prevents command injection
  • Subprocess timeout: Added 30-second timeout to yt-dlp calls
  • Race condition fix: Subtitle files are now scoped to the specific video_id instead of using a global glob
  • Upload size limit: Flask MAX_CONTENT_LENGTH set to 10 MB with a 413 error handler
  • Environment variable support: GOOGLE_SERVICE_ACCOUNT_FILE env var for credential path

Error Handling

  • All endpoints wrapped in try/except blocks
  • User-friendly error messages returned (no stack traces leaked)
  • Proper HTTP status codes (400 for validation, 500 for server errors, 504 for timeouts)

Logging

  • Replaced all print() statements with Python logging module
  • Structured log format: %(asctime)s [%(levelname)s] %(message)s
  • Info-level logs for successful operations, error/exception logs for failures

Testing

  • Verified all existing endpoints still work correctly with valid input
  • Tested empty input returns 400 with clear error message
  • Tested oversized input returns 400 with length limit error
  • Tested invalid video ID format returns 400
  • Tested malformed JSON body is handled gracefully

Changes

  • backend/server.py — 406 insertions, 158 deletions
  • No new dependencies added
  • Fully backward-compatible with existing frontend

Summary by CodeRabbit

  • New Features

    • Optional Google Docs content import and Google Forms generation for quizzes.
    • New "hard" variants for MCQ, Boolean, and short-answer questions.
  • Bug Fixes

    • Stronger input validation, clamped question counts, clearer error responses for QA endpoints.
    • Safer uploads, transcript retrieval with per-video isolation, timeouts, subtitle cleanup.
    • Requests too large now return a clear 413 error.
  • Refactor

    • Improved logging, centralized validation, scoped file handling, and startup readiness.

…ening to server.py

Problem:
- No input validation on any API endpoint (empty text, huge payloads, invalid types)
- Missing error handling causes 500 errors with stack traces leaked to clients
- /getTranscript endpoint vulnerable to command injection via unsanitized video_id
- /getTranscript has race condition when concurrent requests use glob on shared folder
- File uploads have no size limits or extension validation
- No structured logging (bare print statements)
- Hardcoded credential paths with no env variable fallback

Solution:
- Add _validate_input_text() helper: checks for None, empty, type, and max length (50k chars)
- Add _validate_max_questions() helper: clamps to safe range [1, 20]
- Add _allowed_file() helper: whitelist txt/pdf/docx extensions
- Validate video_id against strict regex pattern (alphanumeric, 11 chars) before subprocess
- Add 30s timeout to yt-dlp subprocess call
- Scope subtitle files to video_id to prevent race conditions between concurrent requests
- Add Flask MAX_CONTENT_LENGTH (10 MB) with 413 error handler
- Wrap all endpoints in try/except with user-friendly error messages (no stack traces)
- Replace print() with Python logging module throughout
- Use os.environ.get() for Google service account file path
- Add input validation to generate_gform (qa_pairs and question_type)
- Use request.get_json(silent=True) to handle malformed JSON gracefully

Co-authored-by: Cursor <cursoragent@cursor.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 10, 2026

Warning

Rate limit exceeded

@zohaib-7035 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 1 minutes and 50 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Centralized input validation and logging; guarded Google Docs/Forms integration; MediaWiki optional expansion; robust transcript/upload handling with timeouts and scoped file cleanup; added hard-question endpoints and stricter error responses; changed QuestionGenerator.generate to accept an int | None for num_questions.

Changes

Cohort / File(s) Summary
Server: core & helpers
backend/server.py
Added _validate_input_text, _validate_max_questions, _allowed_file; centralized validation, clamped max_questions, global 413 handler, startup readiness, improved logging, and safer filesystem/scoped paths.
Server: generation, answers & hard variants
backend/server.py
Reworked MCQ/BoolQ/ShortQ/get_problems and answer endpoints to use validated inputs, consistent 4xx/5xx JSON responses, return empty lists on invalid input; added *_hard endpoints that expand via QG/evaluator and apply harder transformations; per-question QA/TF-answer logic with structured errors and logging.
Server: external integrations & media
backend/server.py
Guarded Google Docs/Forms initialization with 503 when unavailable; get_content stricter URL checks; yt-dlp transcript retrieval per-video with timeouts, explicit subtitle handling, and cleanup; upload flow tightened with file validation and detailed logging.
Generator API change
backend/Generator/main.py
Public signature changed: `QuestionGenerator.generate(..., num_questions: int

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant Server as Server
  participant QAModel as QA_Model
  participant GoogleDocs as Google_Docs
  participant MediaWiki as MediaWiki
  participant YouTube as YouTube

  Client->>Server: POST /get_mcq (input_text, max_questions, options)
  alt use_mediawiki or hard variant
    Server->>MediaWiki: expand(article)
    MediaWiki-->>Server: expanded_text
  end
  alt use_google_docs
    Server->>GoogleDocs: fetch(document_url)
    GoogleDocs-->>Server: document_text / 503 error
  end
  alt transcript requested
    Server->>YouTube: fetch/transcribe(video_id) with timeout
    YouTube-->>Server: transcript or 4xx/5xx
  end
  Server->>QAModel: generate(article_text, num_questions, use_evaluator)
  QAModel-->>Server: questions (+answers)
  Server-->>Client: 200 JSON (questions, answers) or 4xx/5xx
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐇 I hopped through lines with careful care,

Validations snug, logs everywhere.
Docs and subs cleaned, questions made bold,
Harder quizzes crafted, temp files controlled.
A little rabbit cheers — neat, safe, and gold.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main change: adding input validation, error handling, and security hardening to server.py, which aligns with the comprehensive changes documented in the raw summary.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/server.py (2)

355-527: ⚠️ Potential issue | 🟠 Major

generate_gform is missing try/except wrapping, and webbrowser.open_new_tab is a server-side anti-pattern.

While all other endpoints were hardened with try/except + generic error messages, generate_gform has no exception handling around the Google Forms API calls (lines 518–527). A failure here will leak a full stack trace to the client.

Additionally, webbrowser.open_new_tab(...) on line 524 attempts to open a browser on the server machine, which is meaningless (and will fail silently or error) in any deployed environment. The edit URL should be returned to the client in the response instead.

Finally, jsonify(result["responderUri"]) on line 523 wraps a plain string, producing a bare JSON string response rather than a structured object — inconsistent with every other endpoint.

Suggested fix (lines 518–527)
-    result = form_service.forms().create(body=NEW_FORM).execute()
-    form_service.forms().batchUpdate(
-        formId=result["formId"], body=NEW_QUESTION
-    ).execute()
-
-    edit_url = jsonify(result["responderUri"])
-    webbrowser.open_new_tab(
-        "https://docs.google.com/forms/d/" + result["formId"] + "/edit"
-    )
-    return edit_url
+    try:
+        result = form_service.forms().create(body=NEW_FORM).execute()
+        form_service.forms().batchUpdate(
+            formId=result["formId"], body=NEW_QUESTION
+        ).execute()
+
+        return jsonify({
+            "responder_url": result["responderUri"],
+            "edit_url": f"https://docs.google.com/forms/d/{result['formId']}/edit",
+        })
+    except Exception:
+        logger.exception("Error creating Google Form")
+        return jsonify({"error": "Failed to create Google Form"}), 500

366-381: ⚠️ Potential issue | 🟠 Major

OAuth interactive flow (tools.run_flow) will not work on a headless server.

tools.run_flow(flow, store) on line 373 launches an interactive browser-based OAuth consent flow. This is designed for local CLI tools, not for a web server handling API requests. On a deployed server, this will either hang or crash.

This is pre-existing code, but since the PR is hardening this endpoint (adding input validation on lines 357–365), it would be worth addressing — or at minimum wrapping in a try/except and returning an appropriate error.

🤖 Fix all issues with AI agents
In `@backend/server.py`:
- Around line 680-725: The code can raise a NameError if
clean_transcript(subtitle_file) fails because transcript_text is referenced
after the finally cleanup; also expected_vtt is unused. Wrap the call to
clean_transcript in a try/except that catches Exception, logs the error via
logger (include video_id and exception), returns an appropriate error response
(e.g., 500) and still cleans up files in the finally block; remove the unused
expected_vtt assignment entirely. Specifically, modify the block that calls
clean_transcript(subtitle_file) so transcript_text is only referenced after a
successful call (or set a safe default only after success), handle exceptions
from clean_transcript with logger.error and jsonify error response, and keep the
existing cleanup loop over matching_files.
- Around line 609-636: The uploaded filename isn't sanitized which allows path
traversal despite _allowed_file checking extensions; update the upload handling
to sanitize filenames (use werkzeug.utils.secure_filename) before passing to
FileProcessor.process_file or, better, apply secure_filename inside
FileProcessor.process_file itself for defense-in-depth; specifically, ensure
upload_file uses secure_filename on request.files["file"].filename (and/or
modify FileProcessor.process_file to call secure_filename on file.filename
before any os.path.join with self.upload_folder) and keep references to
file_processor, upload_file, _allowed_file, and FileProcessor.process_file when
making the change.
- Around line 541-554: The bug is passing a list (input_questions) as
num_questions to qg.generate and mis-typing the generate() signature; ensure
num_questions is an integer before calling qg.generate and fix the type
annotation. Convert/coerce the request value (input_questions) to an int (e.g.,
parse int if a string, or use len(...) if a list was passed intentionally) and
pass that integer to qg.generate(article=..., num_questions=that_int, ...);
update every call site that currently passes input_questions (the calls to
qg.generate at the places noted and the other two occurrences) and change the
generate(...) signature from num_questions: bool = None to num_questions: int =
None and any internal assumptions in _get_ranked_qa_pairs to use the integer
type.
🧹 Nitpick comments (3)
backend/server.py (3)

706-711: Glob pattern is slightly broader than intended.

glob.glob(f"subtitles/{video_id}*.vtt") would also match files whose names start with the same 11 characters but have additional characters before .vtt (e.g., a hypothetical {video_id}xyz.vtt). Since yt-dlp outputs files like {video_id}.en.vtt, a more precise pattern would be:

-    matching_files = glob.glob(f"subtitles/{video_id}*.vtt")
+    matching_files = glob.glob(f"subtitles/{video_id}.*.vtt")

This is low risk given the validated 11-char ID, but tighter is better for correctness.


131-152: Unused exception variable e across all endpoints.

Since logger.exception(...) automatically captures and logs the exception info, the as e binding is unused in every except Exception as e clause. Ruff flags this (F841) on lines 150, 174, 198, 231, 276, 301, 328, 350, 552, 579, 605, 634.

This is a simple cleanup:

-    except Exception as e:
+    except Exception:

Also applies to: 155-176, 179-200


246-258: Consider adding strict=True to zip() for defensive validation.

Line 249 already checks len(input_questions) != len(input_options), so the lists should always be equal length when reaching line 255. Adding strict=True to zip() would serve as a safety net and satisfies the Ruff B905 lint:

-        for question, options in zip(input_questions, input_options):
+        for question, options in zip(input_questions, input_options, strict=True):

Comment thread backend/server.py
Comment thread backend/server.py
Comment on lines +609 to +636
@app.route("/upload", methods=["POST"])
def upload_file():
if 'file' not in request.files:
return jsonify({"error": "No file part"}), 400
if "file" not in request.files:
return jsonify({"error": "No file part in the request"}), 400

file = request.files['file']
uploaded_file = request.files["file"]

if file.filename == '':
return jsonify({"error": "No selected file"}), 400
if uploaded_file.filename == "":
return jsonify({"error": "No file selected"}), 400

content = file_processor.process_file(file)

if content:
return jsonify({"content": content})
else:
return jsonify({"error": "Unsupported file type or error processing file"}), 400
if not _allowed_file(uploaded_file.filename):
return (
jsonify({
"error": f"Unsupported file type. Allowed types: {', '.join(ALLOWED_UPLOAD_EXTENSIONS)}"
}),
400,
)

try:
content = file_processor.process_file(uploaded_file)
if content:
logger.info("Processed uploaded file: %s", uploaded_file.filename)
return jsonify({"content": content})
else:
return jsonify({"error": "Could not extract content from the file"}), 400
except Exception as e:
logger.exception("Error processing uploaded file")
return jsonify({"error": "Failed to process the uploaded file"}), 500
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for secure_filename usage
rg -n 'secure_filename' --type=py

# Also search for FileProcessor class and process_file method
echo "=== FileProcessor.process_file ==="
rg -n 'class FileProcessor' --type=py -A 20

# Check backend/server.py to verify the upload handler
echo "=== Checking backend/server.py line count ==="
wc -l backend/server.py

# Check imports in backend/server.py
echo "=== Imports in backend/server.py ==="
head -50 backend/server.py

Repository: AOSSIE-Org/EduAid

Length of output: 3124


🏁 Script executed:

#!/bin/bash
# Verify lines 609-636 in backend/server.py
echo "=== backend/server.py lines 609-636 ==="
sed -n '609,636p' backend/server.py

# Check full process_file method in FileProcessor
echo -e "\n=== Full process_file method ==="
sed -n '370,390p' backend/Generator/main.py

# Check if _allowed_file function exists and how it validates
echo -e "\n=== _allowed_file function ==="
rg -n '_allowed_file' --type=py -A 5

Repository: AOSSIE-Org/EduAid

Length of output: 2676


Sanitize the uploaded filename to prevent path traversal attacks.

The _allowed_file function validates only the file extension, not the filename itself. A malicious filename like ../../etc/cron.d/evil.txt would pass the check (correct .txt extension) but cause FileProcessor.process_file() to write outside the upload folder, since it uses file.filename directly in os.path.join(self.upload_folder, file.filename) without sanitization.

Use werkzeug.utils.secure_filename to sanitize the filename:

Suggested fix
+from werkzeug.utils import secure_filename
+
 `@app.route`("/upload", methods=["POST"])
 def upload_file():
     ...
     try:
+        uploaded_file.filename = secure_filename(uploaded_file.filename)
         content = file_processor.process_file(uploaded_file)

Alternatively, apply secure_filename inside FileProcessor.process_file() itself for defense-in-depth.

🧰 Tools
🪛 Ruff (0.14.14)

[error] 634-634: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

🤖 Prompt for AI Agents
In `@backend/server.py` around lines 609 - 636, The uploaded filename isn't
sanitized which allows path traversal despite _allowed_file checking extensions;
update the upload handling to sanitize filenames (use
werkzeug.utils.secure_filename) before passing to FileProcessor.process_file or,
better, apply secure_filename inside FileProcessor.process_file itself for
defense-in-depth; specifically, ensure upload_file uses secure_filename on
request.files["file"].filename (and/or modify FileProcessor.process_file to call
secure_filename on file.filename before any os.path.join with
self.upload_folder) and keep references to file_processor, upload_file,
_allowed_file, and FileProcessor.process_file when making the change.

Comment thread backend/server.py
- Fix potential NameError in /getTranscript by wrapping clean_transcript
  in try/except and removing unused expected_vtt variable
- Fix input_questions (list) incorrectly passed as num_questions (int)
  in get_shortq_hard, get_mcq_hard, get_boolq_hard; fix type annotation
  in QuestionGenerator.generate() from bool to int
- Add try/except to generate_gform, remove webbrowser.open_new_tab()
  server-side anti-pattern, return structured JSON response
- Remove unused exception variable 'e' across all except blocks (Ruff F841)
- Tighten glob pattern in getTranscript to use dot separator
- Add strict=True to zip() in get_mcq_answer for defensive validation

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/server.py (1)

744-746: ⚠️ Potential issue | 🟡 Minor

subtitles directory creation only runs under __main__ — production deployments may fail.

When the app is served via Gunicorn/WSGI (not python server.py), os.makedirs("subtitles", ...) never executes. The /getTranscript endpoint will then fail when yt-dlp tries to write to the nonexistent directory.

Move the directory creation to module level or app initialization.

Proposed fix
 app = Flask(__name__)
 app.config["MAX_CONTENT_LENGTH"] = MAX_UPLOAD_SIZE_MB * 1024 * 1024  # 10 MB
+os.makedirs("subtitles", exist_ok=True)
 
 ...
 
 if __name__ == "__main__":
-    os.makedirs("subtitles", exist_ok=True)
     app.run()
🤖 Fix all issues with AI agents
In `@backend/server.py`:
- Around line 596-602: The endpoint passes an invalid answer_style
("true_false") to qg.generate which calls QuestionGenerator.generate_qg_inputs
(which only accepts ["all", "sentences", "multiple_choice"]) causing a
ValueError; fix by changing the call in the endpoint where process_input_text
and qg.generate are used to pass a supported style (e.g., set
answer_style="sentences" to match get_shortq_hard) or, if true/false output is
required, add "true_false" handling inside QuestionGenerator.generate_qg_inputs
and its downstream logic; update references in the qg.generate call and ensure
tests cover the chosen behavior.
- Around line 317-323: The bug is that predict_boolean_answer expects
payload["input_question"] to be a list but the loop passes a string (question),
causing character-level iteration; fix by passing the question wrapped in a list
when calling answer.predict_boolean_answer (i.e., use {"input_text": input_text,
"input_question": [question]}) so predict_boolean_answer receives a list; update
the call site in the loop that builds output (where qa_response is produced) and
keep the existing logic that appends "True" or "False" based on qa_response.
- Around line 542-548: The call to QuestionGenerator.generate in server.py is
ignoring num_questions because generate only enforces num_questions when
use_evaluator=True; update the hard-question endpoints (get_shortq_hard,
get_mcq_hard, get_boolq_hard) to call qg.generate(..., num_questions=...,
use_evaluator=True, ...) so num_questions is honored (or alternatively add clear
docs/comments that num_questions is advisory); ensure you modify each endpoint
that invokes qg.generate to include use_evaluator=True and keep the existing
num_questions argument.
- Around line 598-605: qg.generate returns dicts like {"question": ...,
"answer": ...} but the code maps make_question_harder over the whole dicts
(variable generated) so make_question_harder receives a dict instead of a
string; change the mapping to call make_question_harder(item["question"]) for
each item in generated and preserve the original answer in the output (e.g.,
build harder_questions as [{"question": make_question_harder(item["question"]),
"answer": item["answer"]} for item in generated]) so the endpoint returns both
the hardened question and its original answer via jsonify.
🧹 Nitpick comments (1)
backend/Generator/main.py (1)

416-416: Type annotation fix is correct; consider making Optional explicit.

The bool → int fix correctly reflects the numeric usage. However, PEP 484 prohibits implicit Optionalnum_questions: int = None should be int | None = None (or Optional[int] = None).

Proposed fix
-        num_questions: int = None,
+        num_questions: int | None = None,

Or for Python < 3.10 compatibility:

from __future__ import annotations

Comment thread backend/server.py Outdated
Comment thread backend/server.py
Comment thread backend/server.py
Comment thread backend/server.py Outdated
- Fix critical bug: predict_boolean_answer receives a string instead of
  a list, causing character-level iteration; now wraps question in [question]
- Fix critical bug: answer_style="true_false" is not a valid option in
  QuestionGenerator; changed to "sentences" in get_boolq_hard
- Fix critical bug: make_question_harder in get_boolq_hard receives full
  dict instead of question string; aligned with other hard-question endpoints
- Fix major issue: num_questions silently ignored because use_evaluator
  defaults to False; added use_evaluator=True to all 3 hard-question endpoints
- Fix minor issue: subtitles directory creation moved from __main__ guard
  to module level so it works under Gunicorn/WSGI deployments
- Fix nitpick: num_questions type annotation updated to int | None per PEP 484

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/server.py (1)

386-515: ⚠️ Potential issue | 🟡 Minor

Request-building logic is outside the try/except block — a malformed qa_pair will cause an unhandled KeyError.

Lines 386–515 access qapair["question"], qapair["answer"], etc. without defensive checks. If any item in qa_pairs is missing a required key, a KeyError is raised before reaching the try block at line 517, resulting in Flask's default 500 error (with a stack trace in debug mode) rather than the structured error response.

Either move the request-building inside the existing try/except or validate qa_pairs item structure up front.

Proposed fix (simplest: widen the try block)
+    try:
         requests_list = []
         if question_type == "get_shortq":
             ...
         NEW_QUESTION = {"requests": requests_list}

-    try:
         result = form_service.forms().create(body=NEW_FORM).execute()
🤖 Fix all issues with AI agents
In `@backend/server.py`:
- Line 254: Remove the Python-3.10-only usage of zip(..., strict=True) in the
loop that iterates over input_questions and input_options; instead call
zip(input_questions, input_options) (or otherwise iterate in a
Python-version-compatible way) because the code already checks lengths earlier,
so drop the strict=True from the loop that references input_questions and
input_options in backend.server.py to avoid runtime errors on Python <3.10.
🧹 Nitpick comments (2)
backend/server.py (2)

9-9: Unused import: json

json is imported but never used in this file. The endpoints use request.get_json() and jsonify() from Flask.

Proposed fix
-import json

706-710: Use logger.exception instead of logger.error to include stack traces.

In error-handling blocks where an exception is active, logger.exception automatically includes the traceback, which aids debugging. This aligns with the pattern used elsewhere in the file (e.g., lines 150, 174).

Proposed fix
     except subprocess.TimeoutExpired:
-        logger.error("yt-dlp timed out for video %s", video_id)
+        logger.exception("yt-dlp timed out for video %s", video_id)
         return jsonify({"error": "Transcript download timed out"}), 504
     except subprocess.CalledProcessError as e:
-        logger.error("yt-dlp failed for video %s: %s", video_id, e.stderr)
+        logger.exception("yt-dlp failed for video %s: %s", video_id, e.stderr)
         return jsonify({"error": "Failed to download subtitles"}), 500

Comment thread backend/server.py Outdated
The length equality check above the loop already guarantees both lists
are the same size, making strict=True redundant. Removing it avoids a
runtime error on Python versions below 3.10.

Co-authored-by: Cursor <cursoragent@cursor.com>
@AmazingDude
Copy link
Copy Markdown

Hey @zohaib-7035, first of all great work on this

I was actually looking into these exact 500 errors today. I wrote a quick Python script to stress test the endpoints with about 40+ weird payloads (nulls, missing fields, wrong data types, etc.) and decided to run it against your branch to see how it holds up

The good news: your PR is a massive improvement. It dropped the server crashes from 20 down to just 4 on my end. It perfectly catches all the missing input_text and empty JSON body issues and safely returns 400s now

My script did catch 4 specific edge cases that still manage to bypass the validation and crash the server with a 500. Just dropping them here in case you want to patch them in this PR:

  1. /get_boolq crashes if max_questions is too high (e.g passing 20). The model throws ValueError: num_return_sequences (20) has to be smaller or equal to num_beams (10)

  2. /get_shortq crashes if the body is a JSON array [] instead of an object {}. It bypasses the checks and throws AttributeError: 'list' object has no attribute 'get'

  3. /get_shortq_answer throws a pipeline ValueError if there is a null value inside the questions list

  4. /get_boolq_hard crashes with TypeError: object of type 'int' has no len() if an integer is passed where it expects a list

Solid foundation overall tho! Let me know if you want me to share the Python test script so you can test these locally

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants