Implement Unified LLM API v1 to Simplify LLM Integrations and Enable Multi-Provider Extensibility #413
Conversation
Walkthrough

Adds LLM support: new /llm/call API and callback handler, LLM request/response models, provider abstraction with OpenAI implementation and registry, Celery-backed LLM job lifecycle (start/execute/handle errors), and an Alembic enum migration adding the LLM_API job type.
Sequence Diagram(s)

sequenceDiagram
participant User
participant API as FastAPI (/llm/call)
participant JobSvc as start_job()
participant JobDB as Job DB
participant Queue as Celery Queue
participant Task as execute_job (Celery)
participant Reg as ProviderRegistry
participant Prov as OpenAIProvider
participant Callback as Callback URL
User->>API: POST /llm/call (LLMCallRequest)
API->>JobSvc: start_job(request, project_id, org_id)
JobSvc->>JobDB: create Job (LLM_API, PENDING)
JobSvc->>Queue: enqueue execute_job(...)
JobSvc-->>API: return job_id
API-->>User: 200 OK (job scheduled)
Queue->>Task: run execute_job
Task->>JobDB: set status PROCESSING
Task->>Reg: get_llm_provider(provider_type,...)
Reg-->>Task: Provider instance
Task->>Prov: execute(config, query)
Prov->>Prov: call OpenAI API (conversation handling)
Prov-->>Task: LLMCallResponse / error
alt success
Task->>JobDB: set status SUCCESS
opt callback_url present
Task->>Callback: POST APIResponse[LLMCallResponse]
end
else error
Task->>JobDB: set status FAILED
opt callback_url present
Task->>Callback: POST APIResponse[error]
end
end
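For orientation, the success-branch callback body is an APIResponse envelope wrapping an LLMCallResponse. The illustration below is a sketch only: field names are inferred from the models discussed later in this review (which changed between revisions), and all values are placeholders.

```python
# Hypothetical success callback payload (APIResponse[LLMCallResponse]); field
# names are inferred from the reviewed models and may differ from the final code.
callback_payload = {
    "success": True,
    "data": {
        "status": "success",
        "input_tokens": 42,      # placeholder values
        "output_tokens": 128,
        "total_tokens": 170,
    },
    "error": None,
    "metadata": None,
}
```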
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (2 passed)
✨ Finishing touches: 🧪 Generate unit tests (beta)
📜 Recent review details — Configuration used: CodeRabbit UI | Review profile: CHILL | Plan: Pro
⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
Actionable comments posted: 5
🧹 Nitpick comments (17)
backend/app/services/llm/providers/base.py (3)
27-33: Prefer Generic or Protocol over `Any` for client type.

Using `Any` for the client parameter loses type safety benefits. Consider using a TypeVar with Generic or a Protocol to maintain type information across provider implementations. Apply this diff to add a TypeVar:
+from typing import Any, TypeVar + +ClientT = TypeVar("ClientT") + -class BaseProvider(ABC): +class BaseProvider(ABC, Generic[ClientT]): """Abstract base class for LLM providers. ... """ - def __init__(self, client: Any): + def __init__(self, client: ClientT): """Initialize the provider with client. Args: - client: Provider-specific client (e.g., OpenAI, Anthropic client) + client: Provider-specific client """ self.client = client
35-60: Consider using exception-based error handling instead of tuple returns.

The `tuple[LLMCallResponse | None, str | None]` pattern is error-prone and requires callers to manually check both return values. Consider raising exceptions for errors and returning `LLMCallResponse` directly, or use a Result type. Example alternative:
@abstractmethod def execute(self, request: LLMCallRequest) -> LLMCallResponse: """Execute an LLM call using the provider. Raises: ProviderError: If the API call fails ValidationError: If request validation fails """
61-67: Provider name derivation relies on a fragile naming convention.

The `get_provider_name()` method assumes all provider classes end with a "Provider" suffix. If a class doesn't follow this convention, the result will be unexpected. Consider making this an abstract property or using explicit configuration. Alternative approach:
@property
@abstractmethod
def provider_name(self) -> str:
    """Get the name of the provider."""
    ...

backend/app/models/llm/response.py (2)
22-28: Consider adding validation constraints to numeric fields. Token count fields should have non-negative constraints to catch invalid API responses early.
Apply this diff:
+from sqlmodel import SQLModel, Field + class LLMCallResponse(SQLModel): ... - input_tokens: int - output_tokens: int - total_tokens: int + input_tokens: int = Field(ge=0) + output_tokens: int = Field(ge=0) + total_tokens: int = Field(ge=0)
22-22: Consider using Literal or Enum for the status field.

The `status` field is currently a plain `str`, which allows any value. Using a Literal type or Enum would provide better type safety and validation. Example:
from typing import Literal

status: Literal["success", "error", "pending"]
61-64: Sanitize unexpected-error returns to avoid leaking internals

Returning `str(e)` can expose internal details. Log the full exception with stack trace, but return a generic message.

-    except Exception as e:
-        error_message = f"Unexpected error in LLM service: {str(e)}"
-        logger.error(f"[execute_llm_call] {error_message}", exc_info=True)
-        return None, error_message
+    except Exception as e:
+        logger.error("[execute_llm_call] Unexpected error in LLM service", exc_info=True)
+        return None, "Unexpected error in LLM service"
35-36: Normalize provider names and enable dynamic registration. Make provider matching case-insensitive and allow runtime registration for extensions.
class ProviderFactory: @@ - def create_provider(cls, provider_type: str, client: Any) -> BaseProvider: + def create_provider(cls, provider_type: str, client: Any) -> BaseProvider: """Create a provider instance based on the provider type. @@ - provider_class = cls._PROVIDERS.get(provider_type) + normalized = (provider_type or "").strip().lower() + provider_class = cls._PROVIDERS.get(normalized) @@ - logger.info(f"[ProviderFactory] Creating {provider_type} provider instance") + logger.info(f"[ProviderFactory] Creating {normalized} provider instance") return provider_class(client=client) @@ def get_supported_providers(cls) -> list[str]: """Get list of supported provider types. @@ return list(cls._PROVIDERS.keys()) + + @classmethod + def register_provider(cls, name: str, provider_cls: type[BaseProvider]) -> None: + """Register a provider at runtime (useful for plugins/tests).""" + cls._PROVIDERS[name.strip().lower()] = provider_clsAlso applies to: 48-59, 60-67
backend/app/services/llm/__init__.py (1)
12-13: Avoid side‑effect imports; re‑export explicitly if needed. Importing app.services.llm.specs for side effects is surprising. If you intend to expose OpenAISpec (or others), explicitly import and add it to `__all__`, or drop this import if unused.
-# Initialize model specs on module import -import app.services.llm.specs # noqa: F401 +# from app.services.llm.specs import OpenAISpec +# __all__.append("OpenAISpec")backend/app/models/llm/config.py (2)
28-33: Constrain provider type to prevent typos. Use an Enum or Literal for provider (e.g., "openai") to fail fast on invalid values and align with ProviderFactory.get_supported_providers().
-from sqlmodel import SQLModel +from enum import Enum +from sqlmodel import SQLModel + +class Provider(str, Enum): + openai = "openai" class LLMModelSpec(SQLModel): - provider: str = "openai" + provider: Provider = Provider.openai
37-51: Optional: add basic bounds at this layer. You already enforce ranges in OpenAISpec; adding min/max for max_tokens/top_p here provides earlier feedback when the provider is swapped.
backend/app/services/llm/providers/openai.py (2)
69-87: Make message extraction resilient and collect all text parts. Current logic returns only the first text block; if the output contains multiple text items or different shapes, the result may be empty.
- # Find the first ResponseOutputMessage in the output - for item in output: + texts = [] + for item in output: # Check if it's a message type (has 'role' and 'content' attributes) if hasattr(item, "type") and item.type == "message": if hasattr(item, "content"): # Content is a list of content items if isinstance(item.content, list) and len(item.content) > 0: - # Get the first text content - first_content = item.content[0] - if hasattr(first_content, "text"): - return first_content.text - elif hasattr(first_content, "type") and first_content.type == "text": - return getattr(first_content, "text", "") - return "" + for c in item.content: + if hasattr(c, "text"): + texts.append(c.text) + elif hasattr(c, "type") and c.type == "text": + t = getattr(c, "text", "") + if t: + texts.append(t) + if texts: + return "\n".join(texts) logger.warning( f"[OpenAIProvider] No message found in output array with {len(output)} items" ) return ""
127-129: Guard against missing usage fields. `usage` can be None; default to 0s to avoid AttributeError.
- input_tokens=response.usage.input_tokens, - output_tokens=response.usage.output_tokens, - total_tokens=response.usage.total_tokens, + input_tokens=getattr(getattr(response, "usage", None), "input_tokens", 0), + output_tokens=getattr(getattr(response, "usage", None), "output_tokens", 0), + total_tokens=getattr(getattr(response, "usage", None), "total_tokens", 0),backend/app/services/llm/jobs.py (5)
28-37: Persist the Celery task_id to the job record. Store task_id to correlate job↔task and aid ops.
try: task_id = start_high_priority_job( @@ ) except Exception as e: @@ ) - logger.info( + # Persist task_id for observability + job_crud.update(job_id=job.id, job_update=JobUpdate(task_id=task_id)) + + logger.info( f"[start_job] Job scheduled for LLM call | job_id={job.id}, project_id={project_id}, task_id={task_id}" )Also applies to: 42-46
62-70: Wrap request parsing/logging in try to ensure FAILED status on early errors. If model parsing/logging fails, the job never moves to FAILED.
-def execute_job( +def execute_job( request_data: dict, @@ ) -> LLMCallResponse | None: - """Celery task to process an LLM request asynchronously.""" - request = LLMCallRequest(**request_data) - job_id_uuid = UUID(job_id) - - logger.info( - f"[execute_job] Starting LLM job execution | job_id={job_id}, task_id={task_id}, " - f"provider={request.llm.provider}, model={request.llm.llm_model_spec.model}" - ) - - try: + """Celery task to process an LLM request asynchronously.""" + try: + request = LLMCallRequest(**request_data) + job_id_uuid = UUID(job_id) + logger.info( + f"[execute_job] Starting LLM job execution | job_id={job_id}, task_id={task_id}, " + f"provider={request.llm.llm_model_spec.provider}, model={request.llm.llm_model_spec.model}" + )Also applies to: 71-79
60-61: Silence Ruff ARG001 for the unused Celery task arg. Prefix with an underscore or remove if not required by the signature.
- task_instance, + _task_instance,
86-93: Minor: avoid redundant JobCrud re-instantiation. You already have job_crud in the same context; reuse it.
- job_crud = JobCrud(session=session) - job_crud.update( + job_crud.update( job_id=job_id_uuid, job_update=JobUpdate( status=JobStatus.FAILED, error_message=error_msg ), )
79-83: Define client before the conditional to satisfy analyzers. Initialize client to None before the provider branch; keeps scope clear.
- provider_type = request.llm.llm_model_spec.provider + provider_type = request.llm.llm_model_spec.provider + client = None if provider_type == "openai": client = get_openai_client(session, organization_id, project_id) else: @@ - response, error = execute_llm_call( + response, error = execute_llm_call( request=request, client=client, )Also applies to: 95-98
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
backend/app/alembic/versions/219033c644de_add_llm_im_jobs_table.py (1 hunks)
backend/app/api/main.py (2 hunks)
backend/app/api/routes/llm.py (1 hunks)
backend/app/models/__init__.py (1 hunks)
backend/app/models/job.py (1 hunks)
backend/app/models/llm/__init__.py (1 hunks)
backend/app/models/llm/config.py (1 hunks)
backend/app/models/llm/request.py (1 hunks)
backend/app/models/llm/response.py (1 hunks)
backend/app/services/llm/__init__.py (1 hunks)
backend/app/services/llm/jobs.py (1 hunks)
backend/app/services/llm/orchestrator.py (1 hunks)
backend/app/services/llm/providers/__init__.py (1 hunks)
backend/app/services/llm/providers/base.py (1 hunks)
backend/app/services/llm/providers/factory.py (1 hunks)
backend/app/services/llm/providers/openai.py (1 hunks)
backend/app/services/llm/specs/__init__.py (1 hunks)
backend/app/services/llm/specs/openai.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/models/job.pybackend/app/services/llm/orchestrator.pybackend/app/services/llm/providers/__init__.pybackend/app/services/llm/providers/factory.pybackend/app/services/llm/__init__.pybackend/app/api/main.pybackend/app/models/__init__.pybackend/app/models/llm/response.pybackend/app/services/llm/providers/base.pybackend/app/services/llm/specs/__init__.pybackend/app/api/routes/llm.pybackend/app/services/llm/providers/openai.pybackend/app/models/llm/__init__.pybackend/app/services/llm/jobs.pybackend/app/models/llm/config.pybackend/app/services/llm/specs/openai.pybackend/app/alembic/versions/219033c644de_add_llm_im_jobs_table.pybackend/app/models/llm/request.py
backend/app/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Define SQLModel entities (database tables and domain objects) in backend/app/models/
Files:
backend/app/models/job.pybackend/app/models/__init__.pybackend/app/models/llm/response.pybackend/app/models/llm/__init__.pybackend/app/models/llm/config.pybackend/app/models/llm/request.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/llm/orchestrator.pybackend/app/services/llm/providers/__init__.pybackend/app/services/llm/providers/factory.pybackend/app/services/llm/__init__.pybackend/app/services/llm/providers/base.pybackend/app/services/llm/specs/__init__.pybackend/app/services/llm/providers/openai.pybackend/app/services/llm/jobs.pybackend/app/services/llm/specs/openai.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/main.pybackend/app/api/routes/llm.py
🧬 Code graph analysis (13)
backend/app/services/llm/orchestrator.py (4)
backend/app/models/llm/request.py (1)
LLMCallRequest(11-23)backend/app/models/llm/response.py (1)
LLMCallResponse(8-28)backend/app/services/llm/providers/factory.py (2)
ProviderFactory(16-67)create_provider(35-58)backend/app/services/llm/providers/base.py (1)
execute(36-59)
backend/app/services/llm/providers/__init__.py (3)
backend/app/services/llm/providers/base.py (1)
BaseProvider(14-67)backend/app/services/llm/providers/factory.py (1)
ProviderFactory(16-67)backend/app/services/llm/providers/openai.py (1)
OpenAIProvider(25-157)
backend/app/services/llm/providers/factory.py (2)
backend/app/services/llm/providers/base.py (1)
BaseProvider(14-67)backend/app/services/llm/providers/openai.py (1)
OpenAIProvider(25-157)
backend/app/services/llm/__init__.py (4)
backend/app/services/llm/orchestrator.py (1)
execute_llm_call(17-64)backend/app/services/llm/providers/base.py (1)
BaseProvider(14-67)backend/app/services/llm/providers/factory.py (1)
ProviderFactory(16-67)backend/app/services/llm/providers/openai.py (1)
OpenAIProvider(25-157)
backend/app/models/__init__.py (3)
backend/app/models/llm/request.py (1)
LLMCallRequest(11-23)backend/app/models/llm/response.py (1)
LLMCallResponse(8-28)backend/app/models/llm/config.py (2)
LLMConfig(37-51)LLMModelSpec(12-34)
backend/app/services/llm/providers/base.py (3)
backend/app/models/llm/request.py (1)
LLMCallRequest(11-23)backend/app/models/llm/response.py (1)
LLMCallResponse(8-28)backend/app/services/llm/providers/openai.py (1)
execute(89-157)
backend/app/services/llm/specs/__init__.py (1)
backend/app/services/llm/specs/openai.py (1)
OpenAISpec(15-197)
backend/app/api/routes/llm.py (4)
backend/app/models/auth.py (1)
AuthContext(18-21)backend/app/models/llm/request.py (1)
LLMCallRequest(11-23)backend/app/services/llm/jobs.py (1)
start_job(20-51)backend/app/utils.py (2)
APIResponse(29-53)success_response(36-39)
backend/app/services/llm/providers/openai.py (4)
backend/app/models/llm/request.py (1)
LLMCallRequest(11-23)backend/app/models/llm/response.py (1)
LLMCallResponse(8-28)backend/app/services/llm/providers/base.py (2)
BaseProvider(14-67)execute(36-59)backend/app/services/llm/specs/openai.py (3)
OpenAISpec(15-197)from_llm_request(173-197)to_api_params(101-170)
backend/app/models/llm/__init__.py (3)
backend/app/models/llm/config.py (2)
LLMConfig(37-51)LLMModelSpec(12-34)backend/app/models/llm/request.py (1)
LLMCallRequest(11-23)backend/app/models/llm/response.py (1)
LLMCallResponse(8-28)
backend/app/services/llm/jobs.py (7)
backend/app/celery/utils.py (1)
start_high_priority_job(18-43)backend/app/crud/jobs.py (1)
JobCrud(11-42)backend/app/models/job.py (3)
JobType(16-18)JobStatus(9-13)JobUpdate(47-50)backend/app/models/llm/request.py (1)
LLMCallRequest(11-23)backend/app/models/llm/response.py (1)
LLMCallResponse(8-28)backend/app/services/llm/orchestrator.py (1)
execute_llm_call(17-64)backend/app/utils.py (1)
get_openai_client(175-205)
backend/app/services/llm/specs/openai.py (1)
backend/app/models/llm/request.py (1)
LLMCallRequest(11-23)
backend/app/models/llm/request.py (1)
backend/app/models/llm/config.py (1)
LLMConfig(37-51)
🪛 Ruff (0.14.1)
backend/app/api/routes/llm.py
25-25: Local variable job_id is assigned to but never used
Remove assignment to unused variable job_id
(F841)
backend/app/services/llm/jobs.py
60-60: Unused function argument: task_instance
(ARG001)
🔇 Additional comments (11)
backend/app/alembic/versions/219033c644de_add_llm_im_jobs_table.py (2)
19-21: LGTM! Migration safely adds the enum value.

The migration correctly uses IF NOT EXISTS to make the enum addition idempotent, which is the recommended approach for Postgres enum extensions.
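For reference, the idempotent pattern described here looks roughly like the sketch below; the enum type name `jobtype` is an assumption for illustration and may not match the actual migration.

```python
# Hedged sketch of an idempotent Postgres enum extension in Alembic.
# The type name "jobtype" is an assumption, not copied from the migration.
from alembic import op


def upgrade() -> None:
    # IF NOT EXISTS makes re-running this statement safe on Postgres
    op.execute("ALTER TYPE jobtype ADD VALUE IF NOT EXISTS 'LLM_API'")


def downgrade() -> None:
    # Removing enum values is unsafe if rows reference them, so this is a no-op
    pass
```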
23-24: Empty downgrade is acceptable for enum additions. Removing enum values from Postgres is risky if any rows reference them, so leaving downgrade as a no-op is a reasonable choice. If you need to support rollback, consider checking for usage before removal.
backend/app/models/job.py (1)
16-18: LGTM! JobType enum extended correctly.

The new LLM_API enum member aligns with the Alembic migration and follows the existing pattern.

backend/app/api/main.py (1)
10-10: LGTM! LLM router wired correctly. The new LLM router follows the established pattern for including routers in the API.
Also applies to: 35-35
backend/app/services/llm/specs/__init__.py (1)
1-3: LGTM! Clean module exports. The re-export follows standard Python packaging conventions.
backend/app/services/llm/providers/__init__.py (1)
1-15: LGTM! Well-documented module exports. The provider module exports follow best practices with clear documentation and an appropriate public API surface.
backend/app/models/__init__.py (1)
51-56: Public API re-export looks good. Re-exporting LLM models here improves discoverability and import ergonomics. LGTM.
backend/app/models/llm/__init__.py (1)
1-21: Clean aggregation of LLM models. Well-scoped `__all__` with explicit exports; clear and maintainable.

backend/app/services/llm/__init__.py (1)
15-22: Public surface looks coherent. execute_llm_call and provider types are neatly exported. No blockers.
backend/app/services/llm/specs/openai.py (2)
172-197: from_llm_request mapping is clean. The max_tokens→max_output_tokens mapping and vector store fields are correctly bridged.
152-161: Code implementation is correct per the OpenAI Responses API specification. The schema matches exactly: a "tools" array with file_search type, vector_store_ids as a list, an optional max_num_results, and include=["file_search_call.results"] is a valid include value that returns file_search tool call results. No changes needed.
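As a hedged illustration of the parameter shape described above (the model name and vector store ID are placeholders, not values from this PR):

```python
# Illustrative Responses API parameters for file_search, matching the shape
# described in the comment above; values are placeholders.
params = {
    "model": "gpt-4o-mini",
    "input": "What does the handbook say about onboarding?",
    "tools": [
        {
            "type": "file_search",
            "vector_store_ids": ["vs_example_123"],  # placeholder vector store ID
            "max_num_results": 5,                     # optional
        }
    ],
    "include": ["file_search_call.results"],
}
# response = client.responses.create(**params)  # given an OpenAI client instance
```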
Codecov Report: ❌ Patch coverage flagged. 📢 Thoughts on this report? Let us know!
… registry for provider instantiation, and update OpenAI provider execution logic.
…y comments, and clarify provider responsibilities.
…update response handling
…m_call, reorganize imports, and enhance logging in OpenAIProvider.
…d fix logging format in OpenAIProvider.
…d ConversationConfig, update query parameters, and enhance conversation management logic.
…CallResponse to use Usage, and adjust OpenAIProvider to construct Usage correctly.
…ort conversation ID
Actionable comments posted: 5
♻️ Duplicate comments (2)
backend/app/services/llm/providers/openai.py (1)
94-103: Use correct OpenAI SDK v2 exception classes.

The code catches `openai.OpenAIError`, which does not exist in OpenAI Python SDK v2. The correct base exception is `openai.APIError`, with subclasses including `APIConnectionError`, `APITimeoutError`, `APIStatusError`, `RateLimitError`, and `InternalServerError`. This will cause API errors to fall through to the generic `Exception` handler. Update the imports and exception handler:
-import openai
+from openai import APIError
 from openai import OpenAI
@@
-    except openai.OpenAIError as e:
+    except APIError as e:
         # imported here to avoid circular imports
         from app.utils import handle_openai_error

Alternatively, catch specific subclasses if granular handling is needed:

-import openai
+from openai import APIError, APIConnectionError, RateLimitError, APIStatusError
@@
-    except openai.OpenAIError as e:
+    except (RateLimitError, APIConnectionError, APIStatusError, APIError) as e:

backend/app/api/routes/llm.py (1)
33-41: Return 202 with job_id, guard missing project/org, and align response_model. Return Accepted with a pollable job_id; hard‑fail if the user context lacks project/org. Also fix F541 by removing the unnecessary f‑string. This mirrors prior feedback.
Apply this diff:
-from fastapi import APIRouter +from fastapi import APIRouter, HTTPException @@ -@router.post( - "/llm/call", - response_model=APIResponse[Message], - callbacks=llm_callback_router.routes, -) -async def llm_call( - _current_user: AuthContextDep, _session: SessionDep, request: LLMCallRequest -): +@router.post( + "/llm/call", + status_code=202, + response_model=APIResponse[dict[str, str]], + callbacks=llm_callback_router.routes, +) +async def llm_call( + _current_user: AuthContextDep, _session: SessionDep, request: LLMCallRequest +) -> APIResponse[dict[str, str]]: @@ - project_id = _current_user.project.id - organization_id = _current_user.organization.id + if _current_user.project is None or _current_user.organization is None: + raise HTTPException(status_code=403, detail="Project and organization context required") + project_id = _current_user.project.id + organization_id = _current_user.organization.id @@ - start_job( + job_id = start_job( db=_session, request=request, project_id=project_id, organization_id=organization_id, ) @@ - return APIResponse.success_response( - data=Message( - message=f"Your response is being generated and will be delivered via callback." - ), - ) + logger.info( + f"[llm_call] Scheduled LLM job | job_id={job_id} project_id={project_id} org_id={organization_id}" + ) + return APIResponse.success_response( + data={ + "status": "processing", + "message": "LLM call job scheduled", + "job_id": str(job_id), + }, + )Also applies to: 44-45, 47-52, 54-58, 3-3
🧹 Nitpick comments (5)
backend/app/services/llm/providers/registry.py (1)
50-58: Simplify client construction logic - redundant provider validation.

Lines 50-58 contain redundant validation. The provider type is already validated against `PROVIDER_REGISTRY` at lines 26-36. The `else` clause (lines 54-58) is unreachable because any unsupported provider would have raised `ValueError` earlier. Apply this diff to remove redundant validation:
- if provider_type == "openai": - if "api_key" not in credentials: - raise ValueError("OpenAI credentials not configured for this project.") - client = OpenAI(api_key=credentials["api_key"]) - else: - logger.error( - f"[get_llm_provider] Unsupported provider type requested: {provider_type}" - ) - raise ValueError(f"Provider '{provider_type}' is not supported.") + if provider_type == "openai": + if "api_key" not in credentials: + raise ValueError("OpenAI credentials not configured for this project.") + client = OpenAI(api_key=credentials["api_key"]) + # Future providers can add their client construction herebackend/app/services/llm/providers/openai.py (1)
22-29: Remove redundant client assignment.

Line 29 is redundant because `super().__init__(client)` at line 28 already assigns `self.client = client`, as shown in the BaseProvider class. Apply this diff:
def __init__(self, client: OpenAI): """Initialize OpenAI provider with client. Args: client: OpenAI client instance """ super().__init__(client) - self.client = clientbackend/app/models/llm/request.py (1)
23-27: Consider raising a validation error instead of silently modifying auto_create.

The validator silently disables `auto_create` when both `id` and `auto_create=True` are provided. This could mask user intent or configuration errors. Consider explicitly rejecting conflicting inputs with a clear error message. Apply this diff:
@model_validator(mode="after") def validate_conversation_logic(self): if self.id and self.auto_create: - self.auto_create = False + raise ValueError( + "Cannot set both 'id' and 'auto_create=True'. " + "Use 'id' to continue an existing conversation, " + "or 'auto_create=True' to create a new one." + ) return selfbackend/app/services/llm/jobs.py (1)
53-74: Persist FAILED before attempting the callback; don't raise on callback errors. Update the DB first to avoid losing status if the callback send fails, then attempt a best‑effort callback with logging. Apply this diff:
Apply this diff:
def handle_job_error( @@ - with Session(engine) as session: - job_crud = JobCrud(session=session) - - if callback_url: - send_callback( - callback_url=callback_url, - data=callback_response.model_dump(), - ) - - job_crud.update( + with Session(engine) as session: + job_crud = JobCrud(session=session) + # Persist failure first + job_crud.update( job_id=job_id, job_update=JobUpdate( status=JobStatus.FAILED, error_message=callback_response.error, ), ) + # Best-effort callback + if callback_url: + try: + send_callback( + callback_url=callback_url, + data=callback_response.model_dump(), + ) + except Exception as e: + logger.error( + f"[handle_job_error] Callback delivery failed | job_id={job_id}, error={e}", + exc_info=True, + )backend/app/tests/services/llm/test_jobs.py (1)
273-286: Add a test: provider succeeds but callback send fails should still mark SUCCESS. Prevents regressions from treating callback errors as job failures. Proposed test to append:
Proposed test to append:
@@ class TestExecuteJob: def test_success_with_callback(self, db, job_env, job_for_execution, request_data): """Successful execution with callback.""" @@ assert job_for_execution.status == JobStatus.SUCCESS + + def test_callback_failure_does_not_flip_success(self, db, job_env, job_for_execution, request_data): + """Provider succeeds; callback send fails => job remains SUCCESS.""" + env = job_env + request_data["callback_url"] = "https://example.com/callback" + env["provider"].execute.return_value = (env["mock_llm_response"], None) + env["send_callback"].side_effect = Exception("Webhook down") + result = self._execute_job(job_for_execution, db, request_data) + assert result["success"] + db.refresh(job_for_execution) + assert job_for_execution.status == JobStatus.SUCCESS
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
backend/app/api/routes/llm.py (1 hunks)
backend/app/models/__init__.py (1 hunks)
backend/app/models/llm/__init__.py (1 hunks)
backend/app/models/llm/request.py (1 hunks)
backend/app/models/llm/response.py (1 hunks)
backend/app/services/llm/__init__.py (1 hunks)
backend/app/services/llm/jobs.py (1 hunks)
backend/app/services/llm/providers/__init__.py (1 hunks)
backend/app/services/llm/providers/base.py (1 hunks)
backend/app/services/llm/providers/openai.py (1 hunks)
backend/app/services/llm/providers/registry.py (1 hunks)
backend/app/tests/api/routes/test_llm.py (1 hunks)
backend/app/tests/services/llm/providers/test_openai.py (1 hunks)
backend/app/tests/services/llm/providers/test_registry.py (1 hunks)
backend/app/tests/services/llm/test_jobs.py (1 hunks)
backend/app/tests/utils/openai.py (3 hunks)
backend/pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- backend/app/models/llm/response.py
- backend/app/models/__init__.py
- backend/app/services/llm/providers/__init__.py
- backend/app/models/llm/__init__.py
- backend/app/services/llm/__init__.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/tests/api/routes/test_llm.pybackend/app/services/llm/providers/registry.pybackend/app/models/llm/request.pybackend/app/tests/services/llm/providers/test_openai.pybackend/app/services/llm/providers/openai.pybackend/app/services/llm/providers/base.pybackend/app/tests/services/llm/providers/test_registry.pybackend/app/tests/services/llm/test_jobs.pybackend/app/api/routes/llm.pybackend/app/services/llm/jobs.pybackend/app/tests/utils/openai.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/llm/providers/registry.pybackend/app/services/llm/providers/openai.pybackend/app/services/llm/providers/base.pybackend/app/services/llm/jobs.py
backend/app/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Define SQLModel entities (database tables and domain objects) in backend/app/models/
Files:
backend/app/models/llm/request.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/routes/llm.py
🧬 Code graph analysis (9)
backend/app/tests/api/routes/test_llm.py (2)
backend/app/models/llm/request.py (4)
LLMCallRequest(66-81)QueryParams(31-42)LLMCallConfig(57-63)CompletionConfig(45-54)backend/app/tests/conftest.py (1)
user_api_key_header(90-92)
backend/app/services/llm/providers/registry.py (3)
backend/app/crud/credentials.py (1)
get_provider_credential(121-159)backend/app/services/llm/providers/base.py (1)
BaseProvider(13-63)backend/app/services/llm/providers/openai.py (1)
OpenAIProvider(21-108)
backend/app/tests/services/llm/providers/test_openai.py (3)
backend/app/models/llm/request.py (3)
CompletionConfig(45-54)QueryParams(31-42)ConversationConfig(7-27)backend/app/services/llm/providers/openai.py (2)
OpenAIProvider(21-108)execute(31-108)backend/app/tests/utils/openai.py (1)
mock_openai_response(57-111)
backend/app/services/llm/providers/openai.py (3)
backend/app/models/llm/request.py (2)
CompletionConfig(45-54)QueryParams(31-42)backend/app/models/llm/response.py (4)
LLMCallResponse(42-52)LLMOutput(15-18)LLMResponse(21-39)Usage(9-12)backend/app/services/llm/providers/base.py (2)
BaseProvider(13-63)execute(35-55)
backend/app/services/llm/providers/base.py (3)
backend/app/models/llm/request.py (2)
CompletionConfig(45-54)QueryParams(31-42)backend/app/models/llm/response.py (1)
LLMCallResponse(42-52)backend/app/services/llm/providers/openai.py (1)
execute(31-108)
backend/app/tests/services/llm/providers/test_registry.py (5)
backend/app/services/llm/providers/base.py (1)
BaseProvider(13-63)backend/app/services/llm/providers/openai.py (1)
OpenAIProvider(21-108)backend/app/services/llm/providers/registry.py (2)
get_llm_provider(23-60)get_supported_providers(63-69)backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/tests/conftest.py (1)
db(28-45)
backend/app/tests/services/llm/test_jobs.py (8)
backend/app/crud/jobs.py (1)
JobCrud(11-42)backend/app/utils.py (2)
APIResponse(29-53)failure_response(42-53)backend/app/models/job.py (2)
JobStatus(9-13)JobType(16-18)backend/app/models/llm/request.py (4)
LLMCallRequest(66-81)CompletionConfig(45-54)QueryParams(31-42)LLMCallConfig(57-63)backend/app/models/llm/response.py (4)
LLMCallResponse(42-52)LLMResponse(21-39)LLMOutput(15-18)Usage(9-12)backend/app/services/llm/jobs.py (3)
start_job(19-50)handle_job_error(53-76)execute_job(79-162)backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/tests/conftest.py (1)
db(28-45)
backend/app/api/routes/llm.py (5)
backend/app/models/llm/request.py (1)
LLMCallRequest(66-81)backend/app/models/llm/response.py (1)
LLMCallResponse(42-52)backend/app/models/message.py (1)
Message(5-6)backend/app/services/llm/jobs.py (1)
start_job(19-50)backend/app/utils.py (2)
APIResponse(29-53)success_response(36-39)
backend/app/services/llm/jobs.py (8)
backend/app/crud/jobs.py (1)
JobCrud(11-42)backend/app/models/job.py (3)
JobStatus(9-13)JobType(16-18)JobUpdate(47-50)backend/app/models/llm/request.py (1)
LLMCallRequest(66-81)backend/app/models/llm/response.py (1)
LLMCallResponse(42-52)backend/app/utils.py (3)
APIResponse(29-53)success_response(36-39)failure_response(42-53)backend/app/celery/utils.py (1)
start_high_priority_job(18-43)backend/app/services/llm/providers/registry.py (1)
get_llm_provider(23-60)backend/app/services/llm/providers/openai.py (1)
execute(31-108)
🪛 Ruff (0.14.2)
backend/app/api/routes/llm.py
56-56: f-string without any placeholders
Remove extraneous f prefix
(F541)
backend/app/services/llm/jobs.py
85-85: Unused function argument: task_instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (14)
backend/app/tests/utils/openai.py (1)
57-111: LGTM! Conversation support added correctly. The addition of an optional `conversation_id` parameter and conditional conversation object creation is well-implemented. The mock correctly reflects OpenAI's response structure when conversations are present.

backend/app/tests/api/routes/test_llm.py (1)
7-38: LGTM! Test validates the async LLM call flow correctly. The test properly mocks the background job, constructs a valid request payload, and asserts the expected async response behavior.
backend/app/services/llm/providers/registry.py (2)
15-20: LGTM! Provider registry structure is clean and extensible. The registry design allows easy addition of new providers. The type annotation correctly indicates a mapping from provider identifiers to provider classes.

63-69: LGTM! Helper function is straightforward. The function correctly exposes the list of supported provider types from the registry.

backend/app/services/llm/providers/openai.py (1)
backend/app/services/llm/providers/openai.py (1)
40-57: LGTM! Conversation handling logic is correct. The three-way branching correctly handles:
- Existing conversation ID (use it)
- Auto-create (create new conversation)
- No conversation config (remove any stale conversation params)
This aligns well with the ConversationConfig model validation.
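A minimal sketch of that branching, assuming attribute names along the lines of the reviewed ConversationConfig (not the exact source; the conversation-creation helper is a stand-in):

```python
# Hedged sketch of the three-way conversation handling described above.
# Attribute names and the create_conversation() helper are assumptions.
def create_conversation() -> str:
    # Stand-in for a provider call that creates a conversation and returns its ID
    return "conv_placeholder"


def apply_conversation(params: dict, conversation) -> dict:
    if conversation and conversation.id:
        # Continue an existing conversation
        params["conversation"] = conversation.id
    elif conversation and conversation.auto_create:
        # Create a new conversation and attach its ID
        params["conversation"] = create_conversation()
    else:
        # One-off query: drop any stale conversation parameter
        params.pop("conversation", None)
    return params
```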
backend/app/services/llm/providers/base.py (1)
13-63: LGTM! Clean abstract base class design. The BaseProvider class provides a well-structured contract for LLM providers with (a rough sketch follows this list):
- Clear method signatures with type hints (per coding guidelines)
- Proper use of the @abstractmethod decorator
- Helpful docstrings
- Utility method for provider name derivation
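The sketch below approximates that contract from what this review describes; the exact signatures in backend/app/services/llm/providers/base.py may differ.

```python
# Approximate shape of the BaseProvider contract described above; signatures
# are inferred from this review, not copied from the source.
from abc import ABC, abstractmethod
from typing import Any


class BaseProvider(ABC):
    def __init__(self, client: Any):
        self.client = client

    @abstractmethod
    def execute(self, config: Any, query: Any) -> tuple[Any | None, str | None]:
        """Run the provider call and return (response, error_message)."""
        ...

    def get_provider_name(self) -> str:
        # Derives "openai" from "OpenAIProvider"; relies on the naming
        # convention flagged earlier in this review.
        return type(self).__name__.removesuffix("Provider").lower()
```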
backend/app/models/llm/request.py (3)
31-42: LGTM! QueryParams model is well-structured. The model has proper type hints, field constraints (min_length), and descriptive documentation per coding guidelines.
45-54: LGTM! CompletionConfig provides flexible provider params. The use of `Literal["openai"]` for type safety and `dict[str, Any]` for provider-specific parameters is a good balance between type safety and flexibility.
57-81: LGTM! Request models follow coding guidelines. The models demonstrate:
- Proper type hints throughout (Python 3.11+ per guidelines)
- Pydantic's HttpUrl for URL validation
- Descriptive Field definitions
- Forward-looking design with placeholder comments for future extensions
backend/app/tests/services/llm/providers/test_openai.py (4)
18-42: LGTM! Test fixtures are well-structured. The fixtures provide clean, reusable test dependencies with proper scoping.
44-123: LGTM! Comprehensive testing of conversation handling. The tests thoroughly validate:
- Basic execution without conversations
- Using existing conversation IDs
- Auto-creating conversations
Each test verifies both the response structure and the correct parameters passed to the OpenAI client.
145-192: LGTM! Error handling tests are comprehensive. The tests correctly use `openai.APIError` (SDK v2) and cover all error paths including TypeError, API errors, and generic exceptions.
194-253: LGTM! Edge case testing is thorough. The tests validate important edge cases:
- ConversationConfig present but inactive
- Correct parameter merging from config
- Cleanup of stale conversation parameters
These tests ensure robust behavior across various input scenarios.
backend/app/tests/services/llm/providers/test_registry.py (1)
23-34: LGTM: registry shape and error cases well covered. Coverage for provider presence, subclassing, credential paths, and invalid cases looks solid.
Also applies to: 50-77, 90-106, 107-124
send_callback(
    callback_url=callback_url,
    data=callback_response.model_dump(),
)
🧩 Analysis chain
Mitigate SSRF risk and add timeouts/retries in send_callback
callback_url is user‑controlled. Ensure send_callback enforces:
- scheme allowlist (https), DNS/IP egress safeguards (no RFC1918, localhost), and no redirects to disallowed hosts,
- short connect/read timeouts and limited retries,
- payload size limits and proper auth/signature.
If send_callback already handles this, point to its implementation or docs; otherwise, I can draft a hardened helper.
Also applies to: 129-134
I'll verify the SSRF risk mitigation by examining the send_callback implementation and its call sites.
🏁 Script executed:
#!/bin/bash
# Find send_callback function definition and call sites
echo "=== Locating send_callback function ==="
rg -n "def send_callback|send_callback\(" backend/app/services/llm/jobs.py -A 2 -B 2
echo -e "\n=== Searching for send_callback definition across codebase ==="
rg -n "def send_callback" --type=pyLength of output: 637
🏁 Script executed:
#!/bin/bash
# Examine the jobs.py file around the specified lines
echo "=== Lines 60-75 in jobs.py ==="
sed -n '60,75p' backend/app/services/llm/jobs.py
echo -e "\n=== Lines 125-140 in jobs.py ==="
sed -n '125,140p' backend/app/services/llm/jobs.py

Length of output: 1054
🏁 Script executed:
#!/bin/bash
# Search for send_callback imports in jobs.py
echo "=== Imports in jobs.py ==="
head -50 backend/app/services/llm/jobs.py | grep -n "import\|from"

Length of output: 517
🏁 Script executed:
#!/bin/bash
# Find all files that may contain send_callback
echo "=== Files containing send_callback ==="
rg -l "send_callback" --type=pyLength of output: 258
🏁 Script executed:
#!/bin/bash
# Look at the full send_callback implementation if found
echo "=== Full send_callback implementation ==="
rg -n "def send_callback" --type=py -A 30Length of output: 4152
🏁 Script executed:
#!/bin/bash
# Check for any URL validation utilities or helper functions
echo "=== URL validation related functions ==="
rg -n "scheme|allowlist|redirect|ssrf|url.*validat|Request.*timeout" backend/app/services/llm/ --type=py -iLength of output: 40
🏁 Script executed:
#!/bin/bash
# Look for httpx or requests configurations
echo "=== HTTP client configurations ==="
rg -n "httpx|requests\.|timeout|verify" backend/app/services/llm/jobs.py -B 2 -A 2Length of output: 34
🏁 Script executed:
#!/bin/bash
# Check imports and module structure
cat backend/app/services/llm/jobs.py | head -100

Length of output: 3021
🏁 Script executed:
#!/bin/bash
# Search in the entire codebase for where callback_url is defined/used
echo "=== callback_url definitions and uses ==="
rg -n "callback_url" backend/app/services/llm/jobs.py -B 3 -A 3Length of output: 1385
🌐 Web query:
What are current best practices for preventing SSRF attacks in Python HTTP clients (httpx, requests)?
💡 Result:
Short checklist of current best practices for preventing SSRF when using Python HTTP clients (requests, httpx):
-
Enforce an allowlist of destinations (hostnames or exact URLs) whenever possible — prefer allowlist over denylist. [1][2]
-
Resolve hostnames to IPs and block private/loopback/169.254/metadata and other internal ranges before connecting (defend against DNS rebinding / alternate encodings). Check all A/AAAA results and re-check after redirects. [1][2]
-
Stop automatic redirect-following (or handle redirects manually): set allow_redirects=False (requests) / follow_redirects=False or inspect Response in httpx and validate target before continuing. Beware older urllib3/clients where PoolManager-level redirect disabling was bypassed — keep libraries patched. [2][6][8]
-
Restrict URL schemes & ports: accept only http(s) (or the minimal needed), and validate port numbers (disallow high-risk ports like 22/23/3306 unless explicitly required). [1][2]
-
Use a safe resolver / transport layer that enforces IP filters (custom DNS resolver or transport hooks in httpx; or outgoing proxy that performs filtering). Prefer a vetted library or middleware (e.g., community “hardened” wrappers) if available. [2][7]
-
Do not allow user-supplied credentials or userinfo in URLs (avoid leaking Proxy-Authorization or credentials via redirects). Keep requests/urllib3 up-to-date to avoid known CVEs. [6][9]
-
Network-level defenses: egress firewall rules / security groups that deny-by-default to internal services and cloud metadata endpoints (169.254.169.254). Segment services that perform outbound fetches. [1][2]
-
Principle of least privilege & proxying: perform user-driven fetches through a server-side proxy/service that enforces allowlists, rate limits, timeouts and content-type checks rather than making raw client calls directly. [1][4]
-
Hard timeouts, connection limits, and response-size limits to reduce abuse/DoS from SSRF flows. Log and alert unusual outbound requests for monitoring. [1][4]
-
Test & scan: include SSRF test cases in SAST/DAST, fuzz inputs, and run runtime tests (Burp/OWASP ZAP). Keep dependencies updated and monitor advisories (urllib3/requests/httpx). [3][4][8]
References
- OWASP Top 10 / SSRF prevention guidance. [1]
- OWASP Proactive Controls / "Stop SSRF" guidance. [2]
- OWASP API Security / SSRF mitigation patterns and testing. [3]
- Rapid7 / deep-dive on SSRF mitigations and examples. [4]
- Requests/urllib3 security advisories (redirect/credential leakage CVEs) and guidance to patch and disable redirects correctly. [6][8]
- Community hardened libraries and wrappers for requests that implement IP filters (example projects). [7]
If you want, I can produce a short copy-pasteable snippet for requests and for httpx that implements DNS resolution + IP whitelist/block checks and safe redirect handling.
🌐 Web query:
RFC1918 private IP ranges and localhost protection best practices
💡 Result:
Summary — RFC1918 ranges
- 10.0.0.0/8 (10.0.0.0–10.255.255.255). [1]
- 172.16.0.0/12 (172.16.0.0–172.31.255.255). [1]
- 192.168.0.0/16 (192.168.0.0–192.168.255.255). [1]
Key operational/security points for RFC1918
- Private addresses are not routable on the public Internet; plan filtering at borders to prevent leakage and ambiguous routing. [1]
- Prevent RFC1918 traffic from exiting WAN interfaces (egress filter) to avoid information leakage and misrouting. [2]
- When connecting multiple sites (VPNs, mergers) avoid address collisions by renumbering, using distinct subnets, translation, or choosing randomized/private allocation blocks. [1]
- Use split DNS / separate authoritative zones so private addresses are not exposed in public DNS. [1]
Localhost (127.0.0.1 / ::1) protection best practices
- Bind services only to loopback when they are local-only (use 127.0.0.1 or unix sockets); avoid 0.0.0.0 unless you explicitly need external access. [3][4]
- Use host-based firewalls to block unwanted interfaces and explicitly allow loopback traffic only where appropriate. [2][4]
- Require authentication/authorization even for local services; run services with least privilege. [3][4]
- Harden developer tooling: avoid using production data on localhost, use TLS for sensitive local endpoints, and monitor for unexpected listeners. [4]
- Be careful with port-forwarding or SSH GatewayPorts (it can expose loopback services remotely if enabled). [4]
Quick checklist to apply now
- Audit listening sockets (ss/netstat/lsof) → ensure intended bind addresses. [3]
- Add firewall rules to block RFC1918 egress on WAN and to restrict non-loopback binds. [2]
- Confirm DNS split-horizon so private addresses aren’t published externally. [1]
- Lock down any locally-bound admin APIs (authenticate, use TLS, require localhost-only binding). [3][4]
Sources
[1] RFC 1918 — Address Allocation for Private Internets.
[2] pfSense / Netgate docs — preventing RFC1918 traffic from exiting a WAN interface.
[3] MySQL/networking guidance — bind-address and bind-to-localhost best practice.
[4] General localhost security/developer guidance (localhost isolation, bind restrictions, SSH tunneling considerations).
Critical SSRF vulnerabilities confirmed in send_callback implementations — require hardened URL validation and consistent timeout enforcement.
The current send_callback implementations in backend/app/utils.py and backend/app/api/routes/threads.py lack essential SSRF protections:
- No scheme/hostname validation: Accepts arbitrary schemes and internal IP ranges (RFC1918, localhost, 169.254.x.x metadata endpoints)
- No redirect safeguards: `requests.post()` follows redirects by default, enabling SSRF attacks
- Inconsistent timeout enforcement: `app/utils.py` (used by jobs.py) has timeouts; `threads.py` has none
Both call sites in jobs.py (lines 63–66 and 130–132) pass user-controlled callback_url without validation.
Implement a hardened helper that:
- Enforces https-only scheme
- Validates and blocks RFC1918 (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), localhost, and 169.254.x.x
- Disables redirects (or re-validates after each redirect)
- Enforces short timeouts consistently
- Limits response size
- (Optional) adds HMAC/signature validation
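A minimal sketch of such a helper follows, assuming the requests library and the name safe_send_callback (both assumptions); the timeout, size cap, and validation details would need to match the platform's policies, and a production version should also handle DNS rebinding and redirect re-validation. Both call sites in jobs.py would then use this helper instead of send_callback.

```python
import ipaddress
import socket
from urllib.parse import urlparse

import requests

MAX_CALLBACK_RESPONSE_BYTES = 64 * 1024   # assumed cap
CALLBACK_TIMEOUT = (3, 10)                # assumed connect/read timeouts in seconds


def safe_send_callback(callback_url: str, data: dict) -> None:
    """Illustrative hardened callback sender; names and limits are assumptions."""
    parsed = urlparse(callback_url)
    if parsed.scheme != "https":
        raise ValueError("Callback URL must use https")
    if not parsed.hostname:
        raise ValueError("Callback URL must have a hostname")

    # Resolve the host and reject private / loopback / link-local targets
    # (basic SSRF guard; does not defend against DNS rebinding on its own).
    for info in socket.getaddrinfo(parsed.hostname, parsed.port or 443):
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            raise ValueError("Callback URL resolves to a disallowed address")

    resp = requests.post(
        callback_url,
        json=data,
        timeout=CALLBACK_TIMEOUT,
        allow_redirects=False,  # re-validate manually if redirects must be followed
        stream=True,
    )
    resp.raise_for_status()
    # Cap how much of the response body is read so a hostile endpoint
    # cannot stream unbounded data back.
    resp.raw.read(MAX_CALLBACK_RESPONSE_BYTES, decode_content=True)
```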
🤖 Prompt for AI Agents
In backend/app/services/llm/jobs.py around lines 63–66 and 130–132, the code
sends user-controlled callback_url without SSRF protections; create and use a
hardened helper in backend/app/utils.py (e.g., safe_send_callback) and replace
the direct send_callback calls at these lines to call it; the helper must
enforce HTTPS only, resolve the hostname and block RFC1918 ranges (10.0.0.0/8,
172.16.0.0/12, 192.168.0.0/16), localhost (127.0.0.0/8 and ::1), and link-local
169.254.0.0/16, disallow requests if resolution yields any blocked IPs, disable
redirects (allow_redirects=False) or validate every redirect target with the
same checks, apply a short consistent timeout (e.g., connect+read), stream the
response and enforce a maximum response size when reading, raise/log a clear
error on validation failure, and optionally add HMAC signing of payloads for
authenticity; ensure both call sites use this new helper so validation and
timeouts are consistent.
kartpop left a comment
might want to consider two things either in the current PR or in future PRs:

- rate limits for our llm/call/ endpoint per org per project (a rough sketch follows the sample code below)
- defend against SSRF (server side request forgery) before sending response on callback urls

here is a sample code
import ipaddress
from urllib.parse import urlparse
from pydantic import HttpUrl
class SSRFProtectionError(ValueError):
"""Raised when a URL fails SSRF validation."""
pass
def validate_callback_url_ssrf(url: HttpUrl | str | None) -> HttpUrl | str | None:
"""
Validate callback URL against SSRF attacks.
Blocks localhost, private IPs, and cloud metadata endpoints.
"""
if url is None:
return None
parsed = urlparse(str(url))
# 1. Only allow http/https
if parsed.scheme not in ("http", "https"):
raise SSRFProtectionError("Callback URL must use http or https")
hostname = parsed.hostname
if not hostname:
raise SSRFProtectionError("Callback URL must have a valid hostname")
# 2. Block localhost
if hostname.lower() in ("localhost", "127.0.0.1", "::1", "0.0.0.0"):
raise SSRFProtectionError("Callback URL cannot point to localhost")
# 3. Block cloud metadata endpoint
if hostname == "169.254.169.254":
raise SSRFProtectionError("Callback URL cannot point to metadata endpoint")
# 4. Block private IP addresses
try:
ip_obj = ipaddress.ip_address(hostname)
if ip_obj.is_private or ip_obj.is_loopback or ip_obj.is_link_local:
raise SSRFProtectionError("Callback URL cannot use private IP addresses")
except ValueError:
# Not an IP address, it's a domain name - that's OK
pass
return url
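For the first suggestion above (per-org, per-project rate limiting of /llm/call), here is a rough fixed-window sketch; the use of Redis, the key format, and the limits are illustrative assumptions, not the platform's implementation.

```python
# Rough sketch of per-org, per-project rate limiting for /llm/call.
# Redis usage, key format, and limits are assumptions for illustration.
import time

import redis
from fastapi import HTTPException

r = redis.Redis()        # assumed shared Redis instance

RATE_LIMIT = 60          # requests ...
WINDOW_SECONDS = 60      # ... per minute, per project


def enforce_llm_rate_limit(org_id: str, project_id: str) -> None:
    window = int(time.time()) // WINDOW_SECONDS
    key = f"rl:llm_call:{org_id}:{project_id}:{window}"
    count = r.incr(key)
    if count == 1:
        # First hit in this window: set the key to expire with the window
        r.expire(key, WINDOW_SECONDS)
    if count > RATE_LIMIT:
        raise HTTPException(status_code=429, detail="Rate limit exceeded for this project")
```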
…cription for clarity
…e tests accordingly
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/app/services/llm/providers/openai.py (1)
94-103: Fix OpenAI SDK v2 exception handling.

Line 94 catches `openai.OpenAIError`, which does not exist in OpenAI Python SDK v2. The correct base exception class is `openai.APIError`, with subclasses including `APIConnectionError`, `APITimeoutError`, `APIStatusError`, `RateLimitError`, and `InternalServerError`. The current code will fail to catch OpenAI API errors at runtime, causing them to fall through to the generic `Exception` handler at line 105, which bypasses the specialized `handle_openai_error` logic. Apply this diff to fix the exception handling:
-import openai
+from openai import APIError

-        except openai.OpenAIError as e:
+        except APIError as e:
             # imported here to avoid circular imports
             from app.utils import handle_openai_error

             error_message = handle_openai_error(e)
             logger.error(
                 f"[OpenAIProvider.execute] OpenAI API error: {error_message}",
                 exc_info=True,
             )
             return None, error_message

Note: Catching just `APIError` is sufficient as all SDK exceptions inherit from it. If you need granular handling of specific error types (e.g., `RateLimitError`, `APIConnectionError`), import and catch them explicitly before the `APIError` catch.
🧹 Nitpick comments (3)
backend/app/services/llm/providers/__init__.py (1)
1-6: Consider consolidating imports from the same module.

The two separate `from app.services.llm.providers import` statements can be combined into a single import for better readability. Apply this diff to consolidate the imports:
-from app.services.llm.providers.base import BaseProvider
-from app.services.llm.providers.openai import OpenAIProvider
-from app.services.llm.providers.registry import (
-    LLMProvider,
-    get_llm_provider,
-)
+from app.services.llm.providers.base import BaseProvider
+from app.services.llm.providers.openai import OpenAIProvider
+from app.services.llm.providers.registry import LLMProvider, get_llm_provider

Or keep them separate for logical grouping but reduce to two distinct import blocks if preferred.
backend/app/services/llm/__init__.py (1)
1-9: Consider consolidating imports from the same module.

Both import statements are from `app.services.llm.providers` and can be combined into a single import for consistency. Apply this diff to consolidate the imports:
-# Providers -from app.services.llm.providers import ( - BaseProvider, - OpenAIProvider, -) -from app.services.llm.providers import ( - LLMProvider, - get_llm_provider, -) +# Providers +from app.services.llm.providers import ( + BaseProvider, + LLMProvider, + OpenAIProvider, + get_llm_provider, +)backend/app/services/llm/providers/openai.py (1)
22-29: Remove redundant client assignment.

Line 29 redundantly assigns `self.client = client` after the parent class `BaseProvider.__init__` already sets it at line 28. Apply this diff:
def __init__(self, client: OpenAI): """Initialize OpenAI provider with client. Args: client: OpenAI client instance """ super().__init__(client) - self.client = client
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
backend/app/services/llm/__init__.py (1 hunks)
backend/app/services/llm/providers/__init__.py (1 hunks)
backend/app/services/llm/providers/openai.py (1 hunks)
backend/app/services/llm/providers/registry.py (1 hunks)
backend/app/tests/services/llm/providers/test_registry.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/tests/services/llm/providers/test_registry.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/services/llm/providers/registry.pybackend/app/services/llm/providers/__init__.pybackend/app/services/llm/providers/openai.pybackend/app/services/llm/__init__.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/llm/providers/registry.pybackend/app/services/llm/providers/__init__.pybackend/app/services/llm/providers/openai.pybackend/app/services/llm/__init__.py
🧠 Learnings (1)
📚 Learning: 2025-10-08T12:05:01.317Z
Learnt from: CR
Repo: ProjectTech4DevAI/ai-platform PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-10-08T12:05:01.317Z
Learning: Applies to backend/app/services/response/**/*.py : Put OpenAI Responses API integration, conversation management, and job execution in backend/app/services/response/
Applied to files:
backend/app/services/llm/providers/openai.py
🧬 Code graph analysis (4)
backend/app/services/llm/providers/registry.py (3)
backend/app/crud/credentials.py (1)
get_provider_credential(121-159)backend/app/services/llm/providers/base.py (1)
BaseProvider(13-63)backend/app/services/llm/providers/openai.py (1)
OpenAIProvider(21-108)
backend/app/services/llm/providers/__init__.py (3)
backend/app/services/llm/providers/base.py (1)
BaseProvider(13-63)backend/app/services/llm/providers/openai.py (1)
OpenAIProvider(21-108)backend/app/services/llm/providers/registry.py (2)
LLMProvider(14-40)get_llm_provider(43-70)
backend/app/services/llm/providers/openai.py (4)
backend/app/models/llm/request.py (2)
CompletionConfig(48-57)QueryParams(34-45)backend/app/models/llm/response.py (4)
LLMCallResponse(42-52)LLMOutput(15-18)LLMResponse(21-39)Usage(9-12)backend/app/services/llm/providers/base.py (2)
BaseProvider(13-63)execute(35-55)backend/app/tests/services/llm/providers/test_openai.py (2)
completion_config(32-37)provider(27-29)
backend/app/services/llm/__init__.py (3)
backend/app/services/llm/providers/base.py (1)
BaseProvider(13-63)backend/app/services/llm/providers/openai.py (1)
OpenAIProvider(21-108)backend/app/services/llm/providers/registry.py (2)
LLMProvider(14-40)get_llm_provider(43-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (4)
backend/app/services/llm/providers/openai.py (2)
40-58: LGTM: Parameter building and conversation handling logic. The parameter assembly correctly merges config parameters with query input, and the conversation handling properly covers three scenarios:
- Using existing conversation ID when provided
- Auto-creating a new conversation when requested
- Omitting conversation context for one-off queries
59-87: LGTM: Response construction and success path. The response building correctly maps OpenAI API response fields to the internal `LLMCallResponse` structure, properly extracts the conversation ID, captures token usage, and conditionally includes the raw provider response when requested.

backend/app/services/llm/providers/registry.py (2)
14-40: LGTM: Registry pattern with clear provider constants.The registry implementation properly uses a class constant
OPENAIto avoid magic strings and provides clear error messaging that lists supported providers when an unsupported provider is requested.
43-58: LGTM: Credential retrieval and validation.The function correctly retrieves provider credentials using the existing
get_provider_credentialCRUD function and provides clear error messaging when credentials are not configured.
…te tests for consistency
Actionable comments posted: 0
♻️ Duplicate comments (5)
backend/app/services/llm/jobs.py (4)
85-85: Silence unused parameter warning for Celery compatibility. The task_instance parameter is required by Celery's task signature but intentionally unused. Mark it as such to satisfy the linter.

Apply this diff:

```diff
     task_id: str,
     task_instance,
 ) -> dict:
     """Celery task to process an LLM request asynchronously.

     Returns:
         dict: Serialized APIResponse[LLMCallResponse] on success,
         APIResponse[None] on failure
     """
     request = LLMCallRequest(**request_data)
     job_id: UUID = UUID(job_id)
+    del task_instance  # Intentionally unused (kept for Celery signature compatibility)
```
62-66: Critical SSRF vulnerability: user-controlled callback_url lacks validation. The callback_url is passed directly to send_callback without validation, allowing attackers to:
- Access internal services (RFC1918: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
- Query cloud metadata endpoints (169.254.169.254)
- Perform port scanning and service enumeration
Implement URL validation that enforces:
- HTTPS-only scheme
- DNS resolution with IP blocking for private ranges and localhost
- Disabled or validated redirects
- Short timeouts and response size limits
Consider creating a hardened safe_send_callback helper in app.utils that performs these validations before making the HTTP request; a rough sketch follows.
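A minimal sketch of such a helper, assuming httpx for the outbound request; the safe_send_callback name and the exact policy are illustrative, not existing project code:

```python
import ipaddress
import socket
from urllib.parse import urlparse

import httpx


def safe_send_callback(callback_url: str, data: dict, timeout: float = 5.0) -> None:
    """Sketch: validate a user-supplied callback URL before POSTing to it."""
    parsed = urlparse(callback_url)
    if parsed.scheme != "https":
        raise ValueError("Callback URL must use https")
    if not parsed.hostname:
        raise ValueError("Callback URL has no hostname")

    # Resolve the hostname and reject loopback, private, link-local and reserved ranges
    # (covers RFC1918 space and the 169.254.169.254 metadata endpoint).
    for info in socket.getaddrinfo(parsed.hostname, parsed.port or 443):
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            raise ValueError(f"Callback URL resolves to a blocked address: {ip}")

    # Short timeout and no redirect following keep the request tightly scoped.
    with httpx.Client(timeout=timeout, follow_redirects=False) as client:
        client.post(callback_url, json=data)
```

Note that DNS re-resolution between the check and the request (DNS rebinding) is not covered by this sketch and would need additional handling.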
139-143: Callback failures incorrectly mark successful jobs as FAILED. When send_callback raises an exception (network timeout, endpoint down, etc.), it propagates to the outer exception handler at line 163, which calls handle_job_error and marks the job as FAILED. This is incorrect: the LLM execution succeeded, and callback delivery failures should be non-fatal.
Wrap the callback in a try/except block:
```diff
         if request.callback_url:
+            try:
-            send_callback(
-                callback_url=request.callback_url,
-                data=callback.model_dump(),
-            )
+                send_callback(
+                    callback_url=request.callback_url,
+                    data=callback.model_dump(),
+                )
+            except Exception as e:
+                logger.error(
+                    f"[execute_job] Callback delivery failed (non-fatal) | job_id={job_id}, "
+                    f"task_id={task_id}, error={str(e)}",
+                    exc_info=True,
+                )
```
120-124: Sanitize error messages to prevent sensitive information leakage in callbacks. The error handler passes exception messages directly to callbacks (line 122: str(ve)). Internal error details may expose sensitive information about system architecture, database structure, or configuration. Consider using generic error messages in callbacks while logging detailed errors server-side. For example, at line 122, use a generic message for callbacks:

```diff
     except ValueError as ve:
+        logger.error(f"[execute_job] Provider error: {str(ve)} | job_id={job_id}")
         callback = APIResponse.failure_response(
-            error=str(ve),
+            error="Provider configuration error",
             metadata=request.request_metadata,
         )
```

Apply similar sanitization at line 158 where provider errors are returned.
backend/app/services/llm/providers/openai.py (1)
94-103: Use OpenAI SDK v2 exception classes (APIError family). The code catches openai.OpenAIError, which does not exist in OpenAI Python SDK v2. The correct base exception is openai.APIError, with subclasses including APIConnectionError, APITimeoutError, APIStatusError, RateLimitError, and InternalServerError. This will cause unhandled exceptions at runtime.

Update the import and exception handler:

```diff
-import openai
+from openai import APIError, APIConnectionError, RateLimitError, APIStatusError
 ...
-        except openai.OpenAIError as e:
+        except (RateLimitError, APIConnectionError, APIStatusError, APIError) as e:
             # imported here to avoid circular imports
             from app.utils import handle_openai_error
```

Alternatively, catching just APIError alone would suffice since all subclasses inherit from it.
🧹 Nitpick comments (2)
backend/app/services/llm/providers/openai.py (1)
28-29: Redundant client assignment after super().__init__(). Line 29 assigns self.client = client, but super().__init__(client) on line 28 already sets this attribute in the BaseProvider base class (confirmed in backend/app/services/llm/providers/base.py lines 24-25). The duplicate assignment is harmless but unnecessary.

Apply this diff to remove the redundant line:

```diff
         super().__init__(client)
-        self.client = client
```

backend/app/services/llm/jobs.py (1)
165-165: Remove unnecessary f-string prefix. The f-string on line 165 has no placeholders and should use a regular string literal.

Apply this diff:

```diff
         callback = APIResponse.failure_response(
-            error=f"Unexpected error occurred",
+            error="Unexpected error occurred",
             metadata=request.request_metadata,
         )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- backend/app/services/llm/jobs.py (1 hunks)
- backend/app/services/llm/providers/openai.py (1 hunks)
- backend/app/tests/services/llm/providers/test_openai.py (1 hunks)
- backend/app/tests/services/llm/test_jobs.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
- backend/app/tests/services/llm/providers/test_openai.py
- backend/app/services/llm/jobs.py
- backend/app/services/llm/providers/openai.py
- backend/app/tests/services/llm/test_jobs.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
- backend/app/services/llm/jobs.py
- backend/app/services/llm/providers/openai.py
🧠 Learnings (1)
📚 Learning: 2025-10-08T12:05:01.317Z
Learnt from: CR
Repo: ProjectTech4DevAI/ai-platform PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-10-08T12:05:01.317Z
Learning: Applies to backend/app/services/response/**/*.py : Put OpenAI Responses API integration, conversation management, and job execution in backend/app/services/response/
Applied to files:
backend/app/services/llm/providers/openai.py
🧬 Code graph analysis (4)
- backend/app/tests/services/llm/providers/test_openai.py (3)
  - backend/app/models/llm/request.py (3): CompletionConfig (48-57), QueryParams (34-45), ConversationConfig (7-30)
  - backend/app/services/llm/providers/openai.py (2): OpenAIProvider (21-110), execute (31-110)
  - backend/app/tests/utils/openai.py (1): mock_openai_response (57-111)
- backend/app/services/llm/jobs.py (7)
  - backend/app/crud/jobs.py (1): JobCrud (11-42)
  - backend/app/models/job.py (3): JobStatus (9-13), JobType (16-18), JobUpdate (47-50)
  - backend/app/models/llm/request.py (1): LLMCallRequest (69-88)
  - backend/app/models/llm/response.py (1): LLMCallResponse (42-52)
  - backend/app/utils.py (3): APIResponse (29-53), failure_response (42-53), success_response (36-39)
  - backend/app/celery/utils.py (1): start_high_priority_job (18-43)
  - backend/app/services/llm/providers/registry.py (2): get_llm_provider (43-70), get (27-35)
- backend/app/services/llm/providers/openai.py (4)
  - backend/app/models/llm/request.py (2): CompletionConfig (48-57), QueryParams (34-45)
  - backend/app/models/llm/response.py (4): LLMCallResponse (42-52), LLMOutput (15-18), LLMResponse (21-39), Usage (9-12)
  - backend/app/services/llm/providers/base.py (2): BaseProvider (13-63), execute (35-55)
  - backend/app/tests/services/llm/providers/test_openai.py (2): completion_config (32-37), provider (27-29)
- backend/app/tests/services/llm/test_jobs.py (6)
  - backend/app/crud/jobs.py (1): JobCrud (11-42)
  - backend/app/utils.py (2): APIResponse (29-53), failure_response (42-53)
  - backend/app/models/job.py (2): JobStatus (9-13), JobType (16-18)
  - backend/app/models/llm/request.py (4): LLMCallRequest (69-88), CompletionConfig (48-57), QueryParams (34-45), LLMCallConfig (60-66)
  - backend/app/models/llm/response.py (4): LLMCallResponse (42-52), LLMResponse (21-39), LLMOutput (15-18), Usage (9-12)
  - backend/app/services/llm/jobs.py (3): start_job (19-50), handle_job_error (53-76), execute_job (79-172)
🪛 Ruff (0.14.2)
backend/app/services/llm/jobs.py
85-85: Unused function argument: task_instance
(ARG001)
165-165: f-string without any placeholders
Remove extraneous f prefix
(F541)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (12)
backend/app/services/llm/providers/openai.py (1)
31-87: Main execution logic is well-structured. The provider correctly handles conversation configuration with three distinct paths (existing ID, auto-create, or no conversation), properly merges parameters, and constructs the normalized response. The conditional inclusion of raw provider response is appropriate.
backend/app/tests/services/llm/providers/test_openai.py (6)
21-43: Test fixtures are well-organized. The fixtures properly set up mocks for the client, provider instance, and request parameters, enabling clean test isolation.

49-70: Comprehensive coverage of success path without conversation. The test properly validates all response fields including usage metrics, provider metadata, and the absence of conversation_id and raw response when not requested.

160-179: Correct use of OpenAI SDK v2 APIError exception. The test correctly uses openai.APIError (line 164), which is the proper exception class for OpenAI Python SDK v2. This confirms the provider implementation should use the same exception hierarchy.

192-208: Good edge case coverage for empty conversation config. This test correctly validates that a ConversationConfig without id or auto_create results in no conversation parameter being passed to the API, preventing unexpected behavior.

209-230: Thorough validation of parameter merging. The test ensures that completion config parameters are correctly merged with query input, confirming the provider properly combines static and dynamic parameters.

231-251: Essential test for conversation parameter cleanup. This test validates that stale conversation data in the completion config is properly removed when no conversation configuration is provided in the query, preventing accidental reuse of old conversation IDs.
backend/app/tests/services/llm/test_jobs.py (4)
42-67: Well-structured test for successful job creation. The test comprehensively validates job record creation, Celery task scheduling, and parameter passing, ensuring the integration between database and task queue is correct.

104-200: Comprehensive coverage of error handling scenarios. The three tests thoroughly validate handle_job_error behavior: successful callback delivery, skipping callback when URL is absent, and ensuring database updates proceed even when callbacks fail. This coverage ensures robust error handling.

238-271: Excellent fixture design with shared environment setup. The job_env fixture and _execute_job helper method effectively reduce test duplication while maintaining clarity. The centralized mock setup makes tests more maintainable.

350-365: Good validation of metadata pass-through. This test ensures that client-provided request metadata is correctly propagated through the job execution flow to the callback, enabling request correlation.
backend/app/services/llm/jobs.py (1)
19-50: Well-structured job creation and task scheduling. The function properly creates a database record, schedules a Celery task with all required parameters, handles failures by updating the job status before raising an HTTP exception, and maintains correlation IDs for tracing. The error handling flow is correct.
```python
api_router.include_router(credentials.router)
api_router.include_router(documents.router)
api_router.include_router(doc_transformation_job.router)
api_router.include_router(llm.router)
```
Only curious whether keeping the router name as "llm" could possibly be confusing, since we are taking care of the details of the service that an LLM provider offers, not exactly doing anything to the LLM itself.
Summary
Target issue is #409
This PR introduces the Unified LLM API v1, which decouples the LLM request flow from the existing assistants table and provides a flexible, provider-agnostic interface for making LLM calls.
It lays the groundwork for multi-provider support while initially focusing on OpenAI models.
Key Features
New Endpoint
POST /v1/llm/call: allows direct LLM interaction without an assistant dependency.
Dynamic Configuration Model
Accepts inline configuration or references existing stored configs (id, version --> after config management).
Flexible Query Handling
Supports both one-off and threaded conversations with auto-generated conversation_id if not provided.
Provider-Agnostic Design
Configuration accepts any provider (e.g., openai, anthropic) with extensibility for future integrations.
Template-Based Parameter Merging
Merges static config templates with dynamic query parameters before making provider API calls; a rough sketch of this merge follows below.
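A minimal sketch of the merge idea, using assumed dict shapes rather than the PR's actual models:

```python
def build_params(completion_template: dict, query_input: str) -> dict:
    """Sketch: static template values first, then per-request values layered on top."""
    params = dict(completion_template)   # e.g. {"model": "gpt-4o", "temperature": 0.2}
    params["input"] = query_input        # the dynamic part supplied with each query
    return params


# -> {'model': 'gpt-4o', 'temperature': 0.2, 'input': 'Summarize the meeting notes'}
print(build_params({"model": "gpt-4o", "temperature": 0.2}, "Summarize the meeting notes"))
```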
Example Request
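The PR's collapsed example payload is not reproduced here; the request below is a hypothetical illustration based on the LLMCallRequest / CompletionConfig / QueryParams models referenced in this PR, so the field names, auth header, and host are assumptions:

```python
import httpx

payload = {
    "config": {
        "provider": "openai",
        "completion": {"params": {"model": "gpt-4o", "temperature": 0.2}},
    },
    "query": {
        "input": "Summarize the attached meeting notes.",
        "conversation": {"auto_create": True},
    },
    "callback_url": "https://example.com/llm-callback",  # result is delivered here asynchronously
    "request_metadata": {"trace_id": "abc-123"},
}

response = httpx.post(
    "https://your-deployment.example.com/api/v1/llm/call",
    json=payload,
    headers={"X-API-KEY": "<project api key>"},  # auth header name is an assumption
)
print(response.json())  # APIResponse envelope confirming the job was scheduled
```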
Checklist
Before submitting a pull request, please ensure that you mark these tasks.
- Run `fastapi run --reload app/main.py` or `docker compose up` in the repository root and test.

Summary by CodeRabbit