Add LLM request timeout and retry with exponential backoff #441
Open

Arijit429 wants to merge 5 commits into fireform-core:main from
Conversation
- Add HTTPException handler for consistent error shape across all routes
- Add RequestValidationError handler with human-readable error messages
- Add catch-all Exception handler to prevent stack trace leakage
- Fix duplicate get_template() call in forms.py (was querying DB twice)
- Wrap Controller errors in AppError for safe client-facing messages
- All errors now return a uniform {success, error: {code, message}} envelope (see the sketch below)
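For illustration, a minimal sketch of what these handlers could look like in FastAPI. Only the envelope shape and the three handler types come from the commit above; the handler names, the `AppError` fields, and the logger name are assumptions, not the actual fireform-core code.

```python
import logging

from fastapi import FastAPI, Request
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse
from starlette.exceptions import HTTPException as StarletteHTTPException

logger = logging.getLogger("fireform.errors")  # hypothetical logger name
app = FastAPI()


class AppError(Exception):
    """Domain error carrying a safe, client-facing message (fields assumed)."""

    def __init__(self, code: str, message: str, status: int = 400):
        self.code, self.message, self.status = code, message, status


def error_envelope(status: int, code: str, message: str) -> JSONResponse:
    """The uniform {success, error: {code, message}} shape from the commit."""
    return JSONResponse(
        status_code=status,
        content={"success": False, "error": {"code": code, "message": message}},
    )


@app.exception_handler(AppError)
async def app_error_handler(request: Request, exc: AppError):
    return error_envelope(exc.status, exc.code, exc.message)


@app.exception_handler(StarletteHTTPException)
async def http_exception_handler(request: Request, exc: StarletteHTTPException):
    return error_envelope(exc.status_code, "http_error", str(exc.detail))


@app.exception_handler(RequestValidationError)
async def validation_handler(request: Request, exc: RequestValidationError):
    # Flatten pydantic's error list into one human-readable message.
    message = "; ".join(
        f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
        for err in exc.errors()
    )
    return error_envelope(422, "validation_error", message)


@app.exception_handler(Exception)
async def catch_all_handler(request: Request, exc: Exception):
    # Log the real traceback server-side; never leak it to the client.
    logger.exception("Unhandled error on %s", request.url.path)
    return error_envelope(500, "internal_error", "An internal error occurred.")
```

Registering the handler against Starlette's HTTPException catches FastAPI's subclass as well, so every route shares one error shape.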
…file
- Add GET /health liveness probe for Docker and container orchestration
- Migrate database init from module-level to FastAPI lifespan context manager (see the sketch below)
- Fix Dockerfile: start uvicorn server instead of tail -f /dev/null
- Fix Dockerfile: correct PYTHONPATH from /app/src to /app
- Add Docker HEALTHCHECK directive using /health endpoint
- Add EXPOSE 8000 for container port documentation
- Add FastAPI metadata (title, description, version) for API docs
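A sketch of the lifespan migration and liveness probe described above, using FastAPI's `lifespan` parameter; `init_db`/`close_db` and the metadata values are hypothetical stand-ins for the real fireform setup.

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI


async def init_db() -> None:
    """Hypothetical stand-in for fireform's database initialization."""


async def close_db() -> None:
    """Hypothetical stand-in for releasing database resources."""


@asynccontextmanager
async def lifespan(app: FastAPI):
    await init_db()  # runs once at startup, not at module import time
    yield
    await close_db()  # runs once at shutdown


app = FastAPI(
    title="fireform",  # metadata values here are placeholders
    description="LLM-assisted form filling",
    version="0.1.0",
    lifespan=lifespan,
)


@app.get("/health")
async def health() -> dict:
    """Liveness probe targeted by the Docker HEALTHCHECK directive."""
    return {"status": "ok"}
```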
- Enforce 20 MB max upload size (returns 413 if exceeded)
- Validate PDF magic bytes to reject non-PDF files renamed to .pdf
- Reject empty file uploads with clear 400 error
- Add matching client-side size and empty file checks for instant UX feedback
- Server-side validation is the security authority, client checks are UX only (see the sketch below)
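A sketch of the server-side checks, assuming a FastAPI upload route; the endpoint path and response shape are illustrative, while the 20 MB limit, the 413/400 codes, and the `%PDF-` magic-byte check come from the list above.

```python
from fastapi import FastAPI, HTTPException, UploadFile

app = FastAPI()

MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # 20 MB cap from the commit above


@app.post("/upload")
async def upload_pdf(file: UploadFile):
    data = await file.read()
    if not data:
        raise HTTPException(status_code=400, detail="Uploaded file is empty")
    if len(data) > MAX_UPLOAD_BYTES:
        raise HTTPException(status_code=413, detail="File exceeds the 20 MB limit")
    # Every real PDF begins with the magic bytes %PDF-, whatever its extension.
    if not data.startswith(b"%PDF-"):
        raise HTTPException(status_code=400, detail="File is not a valid PDF")
    return {"success": True, "filename": file.filename}
```

Reading the whole body into memory keeps the sketch simple; a streaming check would avoid buffering 20 MB per request.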
- Add 120s timeout to prevent indefinite request hangs
- Add retry logic (3 attempts) with exponential backoff (2s, 4s, 8s)
- Retry on timeouts, connection errors, and 5xx server errors
- Do not retry on 4xx client errors (permanent failures)
- Extract _call_ollama() method for testability (see the sketch below)
- Replace print() statements with structured logging
- Add per-field logging for extraction debugging
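A sketch of the retry loop, assuming the `requests` library and Ollama's default `/api/generate` endpoint; the URL, payload handling, and the reading of "3 attempts" as three retries after the initial call are assumptions rather than the actual diff.

```python
import logging
import time

import requests

logger = logging.getLogger("fireform.llm")

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default endpoint
TIMEOUT_SECONDS = 120  # from the commit message above
MAX_RETRIES = 3  # reading "3 attempts" as three retries after the first call


def _call_ollama(payload: dict) -> dict:
    """POST to Ollama with a hard timeout, retrying only transient failures."""
    last_error: Exception | None = None
    for attempt in range(MAX_RETRIES + 1):
        try:
            resp = requests.post(OLLAMA_URL, json=payload, timeout=TIMEOUT_SECONDS)
            resp.raise_for_status()  # raises HTTPError on 4xx/5xx
            return resp.json()
        except requests.exceptions.HTTPError as exc:
            if exc.response is not None and exc.response.status_code < 500:
                raise  # 4xx client errors are permanent; never retried
            last_error = exc  # 5xx server errors are worth retrying
        except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as exc:
            last_error = exc
        if attempt < MAX_RETRIES:
            delay = 2 ** (attempt + 1)  # exponential backoff: 2 s, 4 s, 8 s
            logger.warning(
                "Ollama call failed (attempt %d): %s; retrying in %d s",
                attempt + 1, last_error, delay,
            )
            time.sleep(delay)
    raise RuntimeError("Ollama unreachable after all retries") from last_error
```

Because `requests` and `time.sleep()` are blocking, a call like this belongs in a threadpool-executed path if the surrounding route is async.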
Closes #228
Closes #277
Closes #13
Summary
Adds timeout, retry with exponential backoff, and structured logging to the LLM extraction pipeline in src/llm.py.

Problem
The current requests.post() call to Ollama has:
- no timeout, so a hung request blocks the server thread indefinitely
- no retry, so a single transient failure (e.g. Ollama momentarily overloaded) kills the entire form fill permanently
- bare print() statements with no log levels or field context, making debugging impossible in production
Changes
In src/llm.py:
- Extracted a _call_ollama() method for clean separation and testability
- Replaced print() with logging.getLogger("fireform.llm")
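To illustrate the logger swap and the per-field logging mentioned in the commits, a hypothetical extraction loop; the helper name, prompt format, and model name are invented, `_call_ollama` refers to the retry sketch above, and the `"response"` key matches Ollama's /api/generate reply shape.

```python
import logging

logger = logging.getLogger("fireform.llm")


def extract_fields(document_text: str, fields: list[str]) -> dict[str, str]:
    """Extract each form field separately so failures are traceable per field."""
    results: dict[str, str] = {}
    for field in fields:
        logger.info("extracting field %r", field)
        reply = _call_ollama({
            "model": "llama3",  # assumed model name
            "prompt": f"Extract the value of {field!r} from:\n{document_text}",
            "stream": False,
        })
        value = reply.get("response", "").strip()
        logger.debug("field %r -> %r", field, value)
        results[field] = value
    return results
```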
Testing

Changes Summary

- _call_ollama() method

Real-world impact
Ensures firefighter extraction requests never hang indefinitely when Ollama is
slow on the local machine — critical for reliable field use.