@nishika26 nishika26 commented Dec 9, 2025

Summary

Target issue is #426

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, it is tested and has test cases.

Notes

  • Enhanced the send_callback function with callback URL validation: the URL must be a public HTTPS URL and must not resolve to a private, multicast, link-local (which covers cloud metadata endpoints), localhost/loopback, or reserved address.
  • Also set session.trust_env = False for extra protection, so that proxy settings from the environment are ignored.
  • Disabled automatic redirect following in callback requests to prevent redirect-based SSRF attacks, so no redirect destination is contacted without validation.
  • Added a limit on callback response size, configurable via an environment variable.
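
For readers who want a concrete picture of the validation described in these notes, here is a minimal sketch using only the standard library. It is not the code from this PR: the names classify_ip and check_callback_url, the injectable resolver parameter, and the error messages are assumptions made for illustration, and unlike the PR's helper this sketch fails closed on unparsable IP strings.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def classify_ip(ip: str) -> tuple[bool, str]:
    """Return (blocked, reason) for an IP address string."""
    try:
        ip_obj = ipaddress.ip_address(ip)
    except ValueError:
        # Assumption of this sketch: fail closed on unparsable input.
        return True, "unparsable"
    checks = [
        (ip_obj.is_loopback, "loopback/localhost"),
        (ip_obj.is_link_local, "link-local"),
        (ip_obj.is_multicast, "multicast"),
        (ip_obj.is_private, "private"),
        (ip_obj.is_reserved, "reserved"),
    ]
    for blocked, reason in checks:
        if blocked:
            return True, reason
    return False, ""


def check_callback_url(url: str, resolver=socket.getaddrinfo) -> None:
    """Raise ValueError unless url is HTTPS and resolves only to public IPs."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError("Callback URL must use HTTPS")
    if not parsed.hostname:
        raise ValueError("Callback URL must include a hostname")
    try:
        addr_info = resolver(
            parsed.hostname, parsed.port or 443, socket.AF_UNSPEC, socket.SOCK_STREAM
        )
    except socket.gaierror as exc:
        raise ValueError(f"Could not resolve callback hostname: {exc}") from exc
    # Every resolved address must pass, not just the first one.
    for info in addr_info:
        ip = info[4][0]
        blocked, reason = classify_ip(ip)
        if blocked:
            raise ValueError(f"Callback URL resolves to {reason} IP address: {ip}")
```

Passing the resolver as a parameter is only a convenience for testing the sketch without network access; the checks are ordered so that a metadata address such as 169.254.169.254 is reported as link-local rather than private.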

Summary by CodeRabbit

  • New Features

    • Added configurable callback maximum response size (env/configurable).
  • Security Improvements

    • Enforced HTTPS-only callback URLs and blocked private/loopback/link-local/multicast destinations via DNS/IP checks.
    • Callbacks validated before sending and now use stricter handling (no redirects, streaming, size limits); send operations return success/failure.
  • Tests

    • Added comprehensive tests for callback URL validation, SSRF protections, and callback send behavior.


@nishika26 nishika26 self-assigned this Dec 9, 2025

coderabbitai bot commented Dec 9, 2025

Walkthrough

Adds SSRF protections and stricter callback handling: new IP/URL validation utilities, runtime validation calls in API endpoints before using callback URLs, removal of an old post_callback helper, and send_callback now streams responses, disables redirects, and enforces a max response size. New tests cover these behaviors.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| API route validation: backend/app/api/routes/collections.py, backend/app/api/routes/documents.py, backend/app/api/routes/llm.py | Import and invoke validate_callback_url when a callback URL is supplied at runtime; documents.py also adds a HttpUrl import. |
| Callback security & delivery: backend/app/utils.py | Add _is_private_ip(), validate_callback_url() (HTTPS-only, hostname required, DNS-resolved IP checks for loopback/private/link-local/multicast/reserved), and update send_callback() to pre-validate URLs, disable redirects, stream responses, enforce CALLBACK_MAX_RESPONSE_SIZE, and return a boolean status. |
| Core utilities removal: backend/app/core/util.py | Remove the old post_callback function and related HTTP request imports. |
| Configuration: backend/app/core/config.py, .env.example | Add CALLBACK_MAX_RESPONSE_SIZE: int = 1048576 to settings and expose CALLBACK_MAX_RESPONSE_SIZE in .env.example; update nearby comments. |
| Tests: backend/app/tests/utils/test_callback_ssrf.py | New tests for _is_private_ip, validate_callback_url, and send_callback (scheme checks, DNS-mocked IP checks, redirects disabled, streaming, timeout tuple, response-size enforcement, and error/HTTP handling). |

sequenceDiagram
    participant Client as API Endpoint
    participant Validator as validate_callback_url
    participant DNS as DNS Resolver
    participant IPCheck as _is_private_ip
    participant Sender as send_callback
    participant Remote as External Service

    Client->>Validator: validate_callback_url(url)
    Validator->>Validator: ensure HTTPS & hostname present
    Validator->>DNS: resolve hostname
    DNS-->>Validator: return IP list
    loop for each IP
        Validator->>IPCheck: classify IP (private/loopback/link-local/multicast)
        IPCheck-->>Validator: blocked? / reason
    end
    alt any IP blocked
        Validator-->>Client: raise ValueError (SSRF blocked)
    else all IPs OK
        Validator-->>Client: validation passed
        Client->>Sender: send_callback(url, payload)
        Sender->>Validator: re-validate URL (safety)
        Sender->>Remote: POST (stream=True, allow_redirects=False, timeout=(..,..))
        Remote-->>Sender: response stream
        Sender->>Sender: enforce CALLBACK_MAX_RESPONSE_SIZE while streaming
        alt within limit
            Sender-->>Client: True (success)
        else exceeded
            Sender-->>Client: False (closed early)
        end
    end
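
The size-enforcement step in the diagram above can be sketched as a small pure function. This is an illustration under assumptions, not the PR's implementation: the name within_size_limit is hypothetical, and the chunks are modeled as any iterable of bytes so the sketch runs without a network.

```python
from collections.abc import Iterable

CALLBACK_MAX_RESPONSE_SIZE = 1_048_576  # bytes (1 MiB), the default named in this PR


def within_size_limit(
    chunks: Iterable[bytes], max_bytes: int = CALLBACK_MAX_RESPONSE_SIZE
) -> bool:
    """Consume chunks lazily; return False as soon as the total exceeds max_bytes."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_bytes:
            # Stop reading here; the caller is expected to close the response.
            return False
    return True
```

With the requests library, the chunks would typically come from response.iter_content(chunk_size=...) on a response obtained with stream=True and allow_redirects=False, so the body is never buffered in full before the limit is checked.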

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Focus review on backend/app/utils.py (DNS resolution, ipaddress usage, RFC1918/link-local/multicast logic, exact ValueError messages).
  • Verify send_callback uses stream=True, allow_redirects=False, a 2-tuple timeout, enforces CALLBACK_MAX_RESPONSE_SIZE, and closes/releases the response on early termination.
  • Confirm all API call sites now invoke validate_callback_url and that no references to the removed post_callback remain.
  • Check backend/app/core/config.py and .env.example for consistent naming and units.

Poem

🐰 I hopped through HTTPS fields with care,

I sniffed the names and chased each IP lair.
No private burrows let me through the gate,
I sip the stream and cap the byte-sized plate.
Safe callbacks now — hop, secure, and great! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title accurately summarizes the main changes: adding callback URL validation, restricting redirection, and limiting response size, all central objectives of this security enhancement. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 88.24% which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@nishika26 nishika26 linked an issue Dec 9, 2025 that may be closed by this pull request

codecov bot commented Dec 9, 2025

Codecov Report

❌ Patch coverage is 97.39777% with 7 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| backend/app/utils.py | 88.88% | 5 Missing ⚠️ |
| backend/app/api/routes/collections.py | 80.00% | 1 Missing ⚠️ |
| backend/app/api/routes/documents.py | 75.00% | 1 Missing ⚠️ |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (6)
backend/app/core/config.py (1)

121-124: Centralized callback response size limit looks good (consider future configurability).

Defining CALLBACK_MAX_RESPONSE_SIZE here and using it in send_callback gives a clear, global control over max response size. If you expect different deployments to need different limits, consider wiring this through env settings later, but the current 1 MiB default is a solid starting point.
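
As a rough illustration of env-driven configuration for this limit, the sketch below reads the value with the standard library only. The PR itself adds the field to the project's settings class; the function name load_callback_max_response_size and the validation rules here are assumptions for the example.

```python
import os
from collections.abc import Mapping
from typing import Optional

DEFAULT_CALLBACK_MAX_RESPONSE_SIZE = 1_048_576  # bytes, i.e. 1 * 1024 * 1024


def load_callback_max_response_size(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the limit in bytes from env, falling back to the 1 MiB default."""
    source = os.environ if env is None else env
    # Tolerate surrounding whitespace and quotes, as .env files often contain them.
    raw = source.get("CALLBACK_MAX_RESPONSE_SIZE", "").strip().strip('"')
    if not raw:
        return DEFAULT_CALLBACK_MAX_RESPONSE_SIZE
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError("CALLBACK_MAX_RESPONSE_SIZE must be a positive byte count")
    return value
```

Keeping the unit (bytes) in the constant's comment and in the error message avoids ambiguity between bytes and megabytes when operators tune the value.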

backend/app/api/routes/llm.py (1)

8-8: Callback URL validation adds SSRF protection; consider surfacing a clean HTTP error to clients.

Using validate_callback_url before start_job is the right place to block SSRF-style callbacks. Currently, a ValueError from validate_callback_url will bubble up; unless you already have a global ValueError handler, FastAPI will likely turn this into a 500. To keep the API surface predictable, you may want to wrap this in a small try/except and raise an HTTPException (e.g., 400/422) or convert to an APIResponse.failure_response instead.

Example:

 from app.utils import APIResponse, validate_callback_url
+from fastapi import HTTPException

 ...

-    if request.callback_url:
-        validate_callback_url(str(request.callback_url))
+    if request.callback_url:
+        try:
+            validate_callback_url(str(request.callback_url))
+        except ValueError as e:
+            raise HTTPException(status_code=400, detail=str(e))

Also applies to: 47-49

backend/app/api/routes/collections.py (1)

29-29: Consistent SSRF checks on collection callbacks; align error semantics with the rest of the API.

Adding validate_callback_url to both create_collection and delete_collection gives consistent SSRF protection for collection callbacks. As with the LLM route, these calls currently raise ValueError directly; without a global handler this will show up as a 500 to clients. Consider normalizing this to a 4xx via HTTPException or your standard error envelope so invalid callback URLs are clearly reported as client errors.

Pattern sketch:

-from app.utils import APIResponse, load_description, validate_callback_url
+from app.utils import APIResponse, load_description, validate_callback_url
+from fastapi import HTTPException

 ...

-    if request.callback_url:
-        validate_callback_url(str(request.callback_url))
+    if request.callback_url:
+        try:
+            validate_callback_url(str(request.callback_url))
+        except ValueError as e:
+            raise HTTPException(status_code=400, detail=str(e))

 ...

-    if request and request.callback_url:
-        validate_callback_url(str(request.callback_url))
+    if request and request.callback_url:
+        try:
+            validate_callback_url(str(request.callback_url))
+        except ValueError as e:
+            raise HTTPException(status_code=400, detail=str(e))

Also applies to: 84-86, 136-138

backend/app/utils.py (2)

342-392: send_callback hardening is good; consider minor polish and future extensibility.

The updated send_callback now:

  • Re-validates the URL via validate_callback_url.
  • Uses a requests.Session context manager.
  • Disables redirects and enables streaming.
  • Applies connect/read timeouts from settings.
  • Enforces a max response size via CALLBACK_MAX_RESPONSE_SIZE.

This is a solid improvement for SSRF and DoS resistance. A couple of small follow‑ups you might consider (not blocking):

  • If you want invalid callback URLs (from validate_callback_url) to be logged distinctly from network failures, you could catch ValueError separately and log a clearer message before re‑raising or returning False.
  • If you anticipate needing different size limits per deployment, wiring CALLBACK_MAX_RESPONSE_SIZE from env (as mentioned in Settings) would make tuning easier.

No changes required for behaviour; these are purely polish/observability ideas.


33-37: Modernize APIResponse type hints to use built‑in generics.

Ruff’s hint about typing.Dict being deprecated is valid here:

metadata: Optional[Dict[str, Any]] = None

In Python 3.11+, you can simplify this to:

-    metadata: Optional[Dict[str, Any]] = None
+    metadata: dict[str, Any] | None = None

and drop Dict/Optional imports if they become unused. Not urgent, but keeps types idiomatic.

backend/app/tests/utils/test_callback_ssrf.py (1)

1-383: Comprehensive SSRF and callback tests; only minor nit on patch scope.

This test module gives very thorough coverage of _is_private_ip, validate_callback_url, and send_callback (including DNS round‑robin, IPv6, metadata endpoints, timeouts, redirects, streaming, and size limits), which is exactly what this change needs.

One small optional improvement: for the validate_callback_url tests you currently patch socket.getaddrinfo:

@patch("socket.getaddrinfo")
def test_reject_localhost_by_name(self, mock_getaddrinfo):
    ...

Patching app.utils.socket.getaddrinfo instead would narrow the monkey‑patch to the code under test and reduce the chance of surprising interactions with other tests that also use socket:

@patch("app.utils.socket.getaddrinfo")
def test_reject_localhost_by_name(self, mock_getaddrinfo):
    ...

Not required, but a bit safer in a larger test suite.
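
For reference, a DNS-mocked test of this shape can be sketched as follows. The helper resolve_ips is hypothetical and stands in for the code under test; the fake return value mirrors the 5-tuple structure that socket.getaddrinfo produces. One caveat worth keeping in mind: if app.utils does a plain import socket, then app.utils.socket is the same module object as socket, so both patch targets replace the same attribute and the difference is mainly one of readability.

```python
import socket
from unittest.mock import patch


def resolve_ips(hostname: str, port: int = 443) -> list[str]:
    """Hypothetical helper: return the IP strings that getaddrinfo reports."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


def test_localhost_name_resolves_to_loopback() -> None:
    # getaddrinfo yields (family, type, proto, canonname, sockaddr) tuples.
    fake = [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("127.0.0.1", 443))]
    with patch("socket.getaddrinfo", return_value=fake) as mock_getaddrinfo:
        assert resolve_ips("localhost") == ["127.0.0.1"]
        mock_getaddrinfo.assert_called_once_with(
            "localhost", 443, socket.AF_UNSPEC, socket.SOCK_STREAM
        )
```

Using patch as a context manager keeps the mock scoped to the assertions that need it, and the original function is restored when the block exits.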

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 002b191 and d4e774b.

📒 Files selected for processing (7)
  • backend/app/api/routes/collections.py (3 hunks)
  • backend/app/api/routes/documents.py (4 hunks)
  • backend/app/api/routes/llm.py (2 hunks)
  • backend/app/core/config.py (1 hunks)
  • backend/app/core/util.py (0 hunks)
  • backend/app/tests/utils/test_callback_ssrf.py (1 hunks)
  • backend/app/utils.py (2 hunks)
💤 Files with no reviewable changes (1)
  • backend/app/core/util.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use type hints in Python code (Python 3.11+ project)

Files:

  • backend/app/api/routes/llm.py
  • backend/app/api/routes/collections.py
  • backend/app/api/routes/documents.py
  • backend/app/core/config.py
  • backend/app/tests/utils/test_callback_ssrf.py
  • backend/app/utils.py
backend/app/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Expose FastAPI REST endpoints under backend/app/api/ organized by domain

Files:

  • backend/app/api/routes/llm.py
  • backend/app/api/routes/collections.py
  • backend/app/api/routes/documents.py
backend/app/core/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place core functionality (config, DB session, security, exceptions, middleware) in backend/app/core/

Files:

  • backend/app/core/config.py
🧬 Code graph analysis (5)
backend/app/api/routes/llm.py (1)
backend/app/utils.py (1)
  • validate_callback_url (291-339)
backend/app/api/routes/collections.py (1)
backend/app/utils.py (1)
  • validate_callback_url (291-339)
backend/app/api/routes/documents.py (1)
backend/app/utils.py (2)
  • load_description (399-404)
  • validate_callback_url (291-339)
backend/app/tests/utils/test_callback_ssrf.py (1)
backend/app/utils.py (4)
  • _is_private_ip (268-288)
  • validate_callback_url (291-339)
  • send_callback (342-395)
  • _ (408-409)
backend/app/utils.py (1)
backend/app/api/routes/threads.py (1)
  • send_callback (36-48)
🪛 Ruff (0.14.8)
backend/app/utils.py

9-9: typing.Dict is deprecated, use dict instead

(UP035)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)

@nishika26 nishika26 changed the title improve send_callback function improve send_callback function with callback url validation, restricting redirection and limiting response size Dec 9, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
.env.example (1)

83-86: Fix .env formatting to follow best practices.

The new environment variable has several formatting issues flagged by static analysis:

  1. Spaces around equal signs (lines 84-86 all have this issue)
  2. Unquoted value on line 86
  3. Variable ordering (CALLBACK_MAX_RESPONSE_SIZE should come before CALLBACK_READ_TIMEOUT alphabetically)

While lines 84-85 also have spaces (existing code), it's a good opportunity to fix all three lines to follow .env best practices.

Apply this diff to fix the formatting:

-# Callback Timeouts and size limit(in seconds and MB respectively)
-CALLBACK_CONNECT_TIMEOUT = 3
-CALLBACK_READ_TIMEOUT = 10
-CALLBACK_MAX_RESPONSE_SIZE = 1048576  #(1*1024*1024)
+# Callback timeouts (in seconds) and response size limit (in bytes)
+CALLBACK_CONNECT_TIMEOUT=3
+CALLBACK_MAX_RESPONSE_SIZE="1048576"
+CALLBACK_READ_TIMEOUT=10

Note: Removed spaces around =, quoted the numeric value, and reordered alphabetically. The comment now states the size limit in bytes, since 1048576 is a byte count (1 MiB) rather than a value in MB.

backend/app/utils.py (1)

364-364: Remove redundant type cast.

The str() cast is unnecessary since callback_url is already typed as str in the function signature.

-        validate_callback_url(str(callback_url))
+        validate_callback_url(callback_url)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d4e774b and c95f148.

📒 Files selected for processing (3)
  • .env.example (1 hunks)
  • backend/app/api/routes/documents.py (3 hunks)
  • backend/app/utils.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/app/api/routes/documents.py
🧰 Additional context used
📓 Path-based instructions (2)
.env.example

📄 CodeRabbit inference engine (CLAUDE.md)

Provide .env.example as the template for .env

Files:

  • .env.example
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use type hints in Python code (Python 3.11+ project)

Files:

  • backend/app/utils.py
🧬 Code graph analysis (1)
backend/app/utils.py (1)
backend/app/api/routes/threads.py (1)
  • send_callback (36-48)
🪛 dotenv-linter (4.0.0)
.env.example

[warning] 84-84: [SpaceCharacter] The line has spaces around equal sign

(SpaceCharacter)


[warning] 85-85: [SpaceCharacter] The line has spaces around equal sign

(SpaceCharacter)


[warning] 86-86: [SpaceCharacter] The line has spaces around equal sign

(SpaceCharacter)


[warning] 86-86: [UnorderedKey] The CALLBACK_MAX_RESPONSE_SIZE key should go before the CALLBACK_READ_TIMEOUT key

(UnorderedKey)


[warning] 86-86: [ValueWithoutQuotes] This value needs to be surrounded in quotes

(ValueWithoutQuotes)

🪛 Ruff (0.14.8)
backend/app/utils.py

9-9: typing.Dict is deprecated, use dict instead

(UP035)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (2)
backend/app/utils.py (2)

268-289: LGTM! Return type annotation has been corrected.

The function correctly implements comprehensive IP blocking checks (loopback, link-local, multicast, private, reserved). The return type annotation tuple[bool, str] now matches the actual return values, addressing the past review comment.


366-389: Excellent SSRF protection measures!

The implementation demonstrates strong security practices:

  • Pre-request URL validation prevents SSRF attacks
  • allow_redirects=False blocks redirect-based SSRF
  • stream=True with chunked reading prevents memory exhaustion
  • Response size limit (enforced incrementally) protects against DoS via large responses

Note: The response body is read and discarded solely to enforce the size limit. While this consumes bandwidth, it's necessary to prevent attackers from exploiting unlimited response sizes.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
backend/app/utils.py (1)

291-339: Add DNS resolution timeout and normalize IPv6 addresses before IP checks.

The HTTPS-only enforcement and per-IP classification are solid, but there are two gaps:

  1. No timeout on socket.getaddrinfo (DoS risk).
    A slow/malicious DNS server can cause validate_callback_url to hang indefinitely. This was already flagged in a previous review and is still unresolved.

  2. IPv6 scope IDs may bypass blocking.
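    getaddrinfo can return addresses like fe80::1%eth0. On Python versions before 3.9, passing this directly to ipaddress.ip_address raises ValueError, and _is_private_ip then treats it as “not blocked”. Python 3.9+ parses scoped addresses, so on this project's 3.11 runtime stripping the scope ID is a defensive normalization rather than a strict requirement.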

You can address both with something along these lines:

+from concurrent.futures import ThreadPoolExecutor, TimeoutError as FuturesTimeoutError
@@
-        addr_info = socket.getaddrinfo(
-            parsed.hostname,
-            parsed.port or 443,
-            socket.AF_UNSPEC,
-            socket.SOCK_STREAM,
-        )
+        # Resolve DNS with a bounded timeout to avoid hanging validation.
+        # A `with ThreadPoolExecutor(...)` block is avoided on purpose: its exit
+        # waits for the worker to finish, which would defeat the timeout.
+        executor = ThreadPoolExecutor(max_workers=1)
+        future = executor.submit(
+            socket.getaddrinfo,
+            parsed.hostname,
+            parsed.port or 443,
+            socket.AF_UNSPEC,
+            socket.SOCK_STREAM,
+        )
+        try:
+            addr_info = future.result(timeout=5.0)  # consider making this configurable
+        except FuturesTimeoutError:
+            raise ValueError("DNS resolution timeout for callback URL")
+        finally:
+            executor.shutdown(wait=False)
@@
-        for info in addr_info:
-            ip_address = info[4][0]
-            is_blocked, reason = _is_private_ip(ip_address)
+        for info in addr_info:
+            raw_ip = info[4][0]
+            # Strip IPv6 scope IDs like "fe80::1%eth0" before classification
+            ip_for_check = raw_ip.split("%", 1)[0]
+            is_blocked, reason = _is_private_ip(ip_for_check)
             if is_blocked:
                 raise ValueError(
-                    f"Callback URL resolves to {reason} IP address: {ip_address}. "
+                    f"Callback URL resolves to {reason} IP address: {raw_ip}. "
                     f"This IP type is not allowed for callbacks."
                 )

This keeps the existing validation semantics but makes the function resilient to DNS-based DoS and correctly handles IPv6 link-local addresses with scope IDs.
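
A self-contained sketch of a bounded DNS lookup with scope-ID stripping is shown below, mainly to make the timeout behaviour easy to test in isolation. The function name resolve_with_timeout and the injectable resolver are assumptions for illustration. The executor is shut down with wait=False because leaving a with block around a ThreadPoolExecutor waits for the worker thread, which would hold the caller until the slow lookup finishes.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FuturesTimeoutError


def resolve_with_timeout(
    hostname: str,
    port: int = 443,
    timeout: float = 5.0,
    resolver=socket.getaddrinfo,
) -> list[str]:
    """Resolve hostname in a worker thread; raise ValueError if it takes too long."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(
        resolver, hostname, port, socket.AF_UNSPEC, socket.SOCK_STREAM
    )
    try:
        infos = future.result(timeout=timeout)
    except FuturesTimeoutError:
        raise ValueError("DNS resolution timeout for callback URL") from None
    finally:
        # Do not wait for a stuck lookup; the worker finishes in the background.
        executor.shutdown(wait=False)
    # Strip IPv6 scope IDs such as "fe80::1%eth0" before any classification.
    return [info[4][0].split("%", 1)[0] for info in infos]
```

The abandoned worker thread still runs until the underlying lookup returns, so this bounds the caller's wait rather than the lookup itself; an OS-level or resolver-level timeout would be needed to bound the latter.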

🧹 Nitpick comments (2)
backend/app/utils.py (2)

2-10: Use builtin dict[...] instead of typing.Dict (Py3.11 style).

New imports look good. To satisfy the Ruff hint and modern typing style, you can drop Dict and use builtin generics:

-from typing import Any, Dict, Generic, Optional, TypeVar
+from typing import Any, Generic, Optional, TypeVar
@@
 class APIResponse(BaseModel, Generic[T]):
@@
-    metadata: Optional[Dict[str, Any]] = None
+    metadata: Optional[dict[str, Any]] = None

Also applies to: 34-37


268-289: Consider also blocking “unspecified” addresses.

The checks for loopback, link-local, multicast, private, and reserved ranges look solid. You may also want to treat ip_obj.is_unspecified (e.g. 0.0.0.0, ::) as blocked to avoid callbacks to non-routable/placeholder addresses:

-        checks = [
+        checks = [
             (ip_obj.is_loopback, "loopback/localhost"),
             (ip_obj.is_link_local, "link-local"),
             (ip_obj.is_multicast, "multicast"),
             (ip_obj.is_private, "private"),
             (ip_obj.is_reserved, "reserved"),
+            (ip_obj.is_unspecified, "unspecified"),
         ]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c95f148 and 83a8e99.

📒 Files selected for processing (1)
  • backend/app/utils.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use type hints in Python code (Python 3.11+ project)

Files:

  • backend/app/utils.py
🧬 Code graph analysis (1)
backend/app/utils.py (1)
backend/app/api/routes/threads.py (1)
  • send_callback (36-48)
🪛 Ruff (0.14.8)
backend/app/utils.py

9-9: typing.Dict is deprecated, use dict instead

(UP035)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
backend/app/utils.py (2)

268-289: _is_private_ip logic and typing look solid; consider optionally treating “unspecified” as blocked.

The function now returns tuple[bool, str] consistently with its callers, and the set of checks (loopback, link-local, multicast, private, reserved) is a good baseline for callback/IP classification. If you want to be extra strict, you could also treat ip_obj.is_unspecified as blocked (e.g., 0.0.0.0, ::) so that clearly non-routable addresses are rejected up front, but this is optional and not a blocker.


9-9: Optional: modernize Dict usage to built‑in dict for type hints.

Ruff’s UP035 hint applies here: for Python 3.11+ you can drop Dict entirely and rely on PEP‑585 style built‑ins, which you’re already using elsewhere (dict[str, Any] in send_callback):

-from typing import Any, Dict, Generic, Optional, TypeVar
+from typing import Any, Generic, Optional, TypeVar
@@
-    metadata: Optional[Dict[str, Any]] = None
+    metadata: Optional[dict[str, Any]] = None
@@
-        cls, data: T, metadata: Optional[Dict[str, Any]] = None
+        cls, data: T, metadata: Optional[dict[str, Any]] = None
@@
-        metadata: Optional[Dict[str, Any]] = None,
+        metadata: Optional[dict[str, Any]] = None,

This is non‑functional and can be done whenever you next touch this class.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83a8e99 and 1ba5150.

📒 Files selected for processing (1)
  • backend/app/utils.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use type hints in Python code (Python 3.11+ project)

Files:

  • backend/app/utils.py
🧬 Code graph analysis (1)
backend/app/utils.py (1)
backend/app/api/routes/threads.py (1)
  • send_callback (36-48)
🪛 Ruff (0.14.8)
backend/app/utils.py

9-9: typing.Dict is deprecated, use dict instead

(UP035)


394-394: f-string without any placeholders

Remove extraneous f prefix

(F541)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (2)
backend/app/utils.py (2)

2-10: New networking imports are appropriate and used correctly.

ipaddress, socket, and urlparse are all exercised in the new SSRF helpers and keep concerns local to this module; no issues from a dependency or usage standpoint.


291-339: validate_callback_url provides a good HTTPS-only, IP-based SSRF guardrail.

The URL parsing + HTTPS enforcement + hostname requirement + getaddrinfo over all resolved addresses, combined with _is_private_ip checks, gives you reasonable coverage against private/loopback/link-local/multicast/reserved targets and basic DNS‑rebind scenarios. The exception handling is also clear: callers see ValueError with a useful message for both policy rejections and unexpected resolution errors.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
backend/app/utils.py (2)

9-9: Use lowercase dict instead of deprecated typing.Dict.

For Python 3.9+, the built-in dict can be used directly in type annotations. send_callback already uses dict[str, Any], so the Dict import can be dropped once the remaining usages are updated.

-from typing import Any, Dict, Generic, Optional, TypeVar
+from typing import Any, Generic, Optional, TypeVar

Then update remaining usages in APIResponse:

-    metadata: Optional[Dict[str, Any]] = None
+    metadata: Optional[dict[str, Any]] = None

363-367: Remove unnecessary str() conversion.

The callback_url parameter is already typed as str, so the str() conversion is redundant.

     try:
-        validate_callback_url(str(callback_url))
+        validate_callback_url(callback_url)
     except ValueError as ve:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ba5150 and 29cbaea.

📒 Files selected for processing (1)
  • backend/app/utils.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use type hints in Python code (Python 3.11+ project)

Files:

  • backend/app/utils.py
🧬 Code graph analysis (1)
backend/app/utils.py (2)
backend/app/api/routes/threads.py (1)
  • send_callback (36-48)
backend/app/core/cloud/storage.py (2)
  • stream (125-127)
  • stream (179-195)
🪛 Ruff (0.14.8)
backend/app/utils.py

9-9: typing.Dict is deprecated, use dict instead

(UP035)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (3)
backend/app/utils.py (3)

268-289: LGTM!

The IP validation logic is well-structured with comprehensive checks covering loopback, link-local, multicast, private, and reserved IP ranges. The return type annotation is correct, and gracefully handling ValueError for invalid IP strings is appropriate.


291-339: LGTM!

The SSRF validation is thorough: enforces HTTPS, validates hostname presence, resolves DNS, and checks all returned IP addresses against the private/reserved ranges. Checking all IPs from getaddrinfo provides protection against DNS round-robin attacks.


369-401: LGTM on the SSRF hardening!

The implementation addresses all key SSRF vectors:

  • session.trust_env = False prevents proxy-based SSRF
  • allow_redirects=False blocks redirect-based attacks
  • stream=True with chunked size checking prevents response-based DoS
  • Pre-request URL validation with IP resolution

Labels

enhancement New feature or request ready-for-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve send_callback Function

2 participants