
Conversation

@AkhileshNegi
Collaborator

@AkhileshNegi AkhileshNegi commented Oct 6, 2025

Summary

Issue: #362
Enhancing the fine-tuning workflow by adding automatic evaluation triggers and improving status handling.

  • Fine-tuning API improvements: Added logic to automatically trigger evaluation when status changes from in_progress to completed
  • Status ENUM update: Added cancelled status to fine-tuning model
  • Test coverage: Significantly expanded test cases for fine-tuning API

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, ensure it is tested and has test cases.

Summary by CodeRabbit

  • New Features

    • Upload a CSV to start fine-tuning via form inputs (base model, split ratios, system prompt).
    • Automatically creates multiple fine-tuning jobs per provided split ratios and links them to the uploaded dataset.
    • Retrieval endpoints return signed URLs for uploaded documents.
    • Fine-tune status now includes a “cancelled” state.
    • On job completion, an automatic model evaluation is queued if none exists.
  • Tests

    • Added tests covering CSV/cloud storage flows and automatic evaluation triggering.

@coderabbitai

coderabbitai bot commented Oct 6, 2025

Walkthrough

Adds async CSV fine-tune upload handling: validates and stores CSV, creates a Document, spawns multiple FineTuning jobs per split ratio, and returns document_id; refresh now maps "cancelled" status and conditionally creates/queues ModelEvaluation when jobs complete. Tests migrated to moto S3 and validate auto-evaluation logic.
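
For orientation, the refresh-side change amounts to roughly the following (a minimal sketch, not the exact route code; the helper names come from the modules listed under "Code graph analysis" below, while their signatures, the import path of update_finetune_job, and the provider status keys other than "cancelled" are assumptions):

from fastapi import BackgroundTasks
from sqlmodel import Session

from app.api.routes.model_evaluation import run_model_evaluation
from app.crud.model_evaluation import create_model_evaluation, fetch_active_model_evals
from app.crud.fine_tuning import update_finetune_job  # import path assumed
from app.models.fine_tuning import FineTuningStatus, FineTuningUpdate
from app.models.model_evaluation import ModelEvaluationBase

# Provider-status keys are assumptions; only the "cancelled" mapping is confirmed by this PR.
OPENAI_TO_INTERNAL_STATUS = {
    "running": FineTuningStatus.running,
    "succeeded": FineTuningStatus.completed,
    "failed": FineTuningStatus.failed,
    "cancelled": FineTuningStatus.cancelled,  # new mapping added here
}

def refresh_status_sketch(
    session: Session, job, openai_job, background_tasks: BackgroundTasks
):
    mapped_status = OPENAI_TO_INTERNAL_STATUS.get(getattr(openai_job, "status", None))
    # Evaluation is queued only on the transition running -> completed.
    is_newly_completed = (
        job.status == FineTuningStatus.running
        and mapped_status == FineTuningStatus.completed
    )
    update_finetune_job(
        session=session, job=job, update=FineTuningUpdate(status=mapped_status)
    )
    if is_newly_completed and not fetch_active_model_evals(session, job.id):
        evaluation = create_model_evaluation(
            session, ModelEvaluationBase(fine_tuning_id=job.id)
        )
        background_tasks.add_task(run_model_evaluation, evaluation.id)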

Changes

  • API routes: CSV upload, job creation, status refresh — backend/app/api/routes/fine_tuning.py
    Converts fine_tune_from_CSV to an async File/Form handler; validates CSV and parses split_ratio; uploads CSV to cloud storage and creates a Document; creates multiple FineTuning jobs (one per split); returns document_id; updates refresh_fine_tune_status to accept BackgroundTasks, map "cancelled" to FineTuningStatus.cancelled, and auto-create/queue ModelEvaluation when appropriate; removes route-local handle_openai_error.
  • Core evaluation: predictions & storage — backend/app/core/finetune/evaluation.py
    Minor refactors and formatting: imports uuid and uses it to generate unique prediction CSV filenames; switches handle_openai_error import to internal app.utils; adjusts logging/message formatting; preserves retry/timeouts and evaluation logic.
  • Model enum — backend/app/models/fine_tuning.py
    Adds FineTuningStatus.cancelled = "cancelled".
  • Tests: S3, mocks, auto-eval scenarios — backend/app/tests/api/routes/test_fine_tuning.py
    Reworks tests to use moto S3 and a real boto3 client, patches cloud storage/OpenAI/process_fine_tuning_job; asserts creation of pending jobs per split ratio; adds suites verifying auto-evaluation creation on completion, skipping when active evals exist, and ModelEvaluation record integrity.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant API as FineTuning API
  participant Storage as Cloud Storage
  participant DB as Database
  participant BG as Background Tasks

  User->>API: POST /fine_tune_from_CSV (file, base_model, split_ratio, system_prompt)
  API->>API: Validate CSV and parse split_ratio
  API->>Storage: Upload CSV
  Storage-->>API: file_key / signed URL
  API->>DB: Create Document (with storage key)
  DB-->>API: document_id
  loop for each split ratio
    API->>DB: Create FineTuningJob (pending, document_id, base_model, split)
  end
  API-->>User: { document_id, jobs[] }

  Note over API,BG: Background processing handles provider interactions and job execution
sequenceDiagram
  autonumber
  actor User
  participant API as FineTuning API
  participant Provider as FT Provider
  participant DB as Database
  participant BG as Background Tasks

  User->>API: POST /fine_tune/{id}/refresh
  API->>Provider: Fetch job status
  Provider-->>API: status (running/completed/failed/cancelled) + fine_tuned_model?
  API->>DB: Update FineTuningJob status
  alt status == completed and no active evaluation exists
    API->>DB: Create ModelEvaluation (pending)
    API->>BG: Queue run_model_evaluation(ModelEvaluation.id)
  else status == completed and active evaluation exists
    API-->>API: Skip creating evaluation
  else not completed
    API-->>API: No evaluation action
  end
  API-->>User: Updated job info

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Classification Feature #320: Direct overlap on fine-tuning routes, CSV/upload handling, and evaluation wiring — likely touches the same modules and tests.

Suggested reviewers

  • kartpop
  • avirajsingh7

Poem

I thump my paw on CSV plains,
I split the rows in tidy trains,
Buckets hum while jobs are queued,
When finished, evals are pursued,
Rabbity cheers — all tests green-tuned! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

  • Title Check ⚠️ Warning — The current title "Classification: Merging endpoints" does not reflect the main changes in this pull request, which focus on enhancing the fine-tuning API with CSV support, automatic evaluation triggers, and a cancelled status; it misleadingly suggests classification endpoint merging rather than fine-tuning workflow improvements. Resolution: rename the pull request to concisely convey the main updates, for example: "Enhance fine-tuning API with CSV support, automatic evaluation triggers, and cancelled status."
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 44.44%, below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (1 passed)

  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.


@AkhileshNegi AkhileshNegi self-assigned this Oct 6, 2025
@AkhileshNegi AkhileshNegi marked this pull request as ready for review October 6, 2025 04:59
@AkhileshNegi AkhileshNegi added the enhancement New feature or request label Oct 6, 2025
@codecov

codecov bot commented Oct 6, 2025

Codecov Report

❌ Patch coverage is 97.72727% with 4 lines in your changes missing coverage. Please review.

Files with missing lines:

  • backend/app/api/routes/fine_tuning.py — 89.65% patch coverage, 3 lines missing ⚠️
  • backend/app/tests/api/routes/test_fine_tuning.py — 99.29% patch coverage, 1 line missing ⚠️



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
backend/app/core/finetune/evaluation.py (1)

99-116: Possible infinite loop on latency timeout in generate_predictions.

When elapsed_time > max_latency you log and continue without incrementing attempt, causing a tight infinite loop under chronic slowness. Increment attempts and fail gracefully after retries.

Apply this diff:

-                    elapsed_time = time.time() - start_time
-                    if elapsed_time > self.max_latency:
-                        logger.warning(
-                            f"[generate_predictions] Timeout exceeded for prompt {idx}/{total_prompts}. Retrying..."
-                        )
-                        continue
+                    elapsed_time = time.time() - start_time
+                    if elapsed_time > self.max_latency:
+                        attempt += 1
+                        logger.warning(
+                            f"[generate_predictions] Timeout exceeded for prompt {idx}/{total_prompts} "
+                            f"({elapsed_time:.2f}s > {self.max_latency}s). Attempt {attempt}/{self.retries}."
+                        )
+                        if attempt == self.retries:
+                            predictions.append("timeout")
+                            logger.error(
+                                f"[generate_predictions] Maximum retries reached for prompt {idx}/{total_prompts} after timeouts. "
+                                "Appending 'timeout'."
+                            )
+                            break
+                        else:
+                            # brief backoff to avoid hot loop
+                            time.sleep(1)
+                        continue

Additionally, consider guarding normalize_prediction when allowed_labels is empty to avoid StopIteration (e.g., return the raw label or a sentinel).

Also applies to: 125-168
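
For illustration, the guard this comment has in mind could look like the following (a sketch only; the actual normalize_prediction signature and label handling in evaluation.py may differ):

def normalize_prediction(self, raw_label: str) -> str:
    """Map raw model output onto one of self.allowed_labels, guarding the empty case."""
    if not self.allowed_labels:
        # No labels loaded: return the raw label (or a sentinel) instead of
        # letting next() raise StopIteration on an empty iterator.
        return raw_label
    cleaned = raw_label.strip().lower()
    return next(
        (label for label in self.allowed_labels if label.lower() == cleaned),
        "unknown",  # sentinel when nothing matches
    )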

backend/app/api/routes/fine_tuning.py (2)

94-109: handle_openai_error is undefined — replace with str(e) to avoid NameError.

This will crash both background processing and refresh paths. Use stringified exception (consistent with evaluation.py change).

-            except openai.OpenAIError as e:
-                error_msg = handle_openai_error(e)
+            except openai.OpenAIError as e:
+                error_msg = str(e)
                 logger.error(
                     f"[process_fine_tuning_job] Failed to upload to OpenAI: {error_msg} | "
                     f"job_id={job_id}, project_id={project_id}|"
                 )
@@
-            except openai.OpenAIError as e:
-                error_msg = handle_openai_error(e)
+            except openai.OpenAIError as e:
+                error_msg = str(e)
                 logger.error(
                     f"[process_fine_tuning_job] Failed to create OpenAI fine-tuning job: {error_msg} | "
                     f"job_id={job_id}, project_id={project_id}|"
                 )
                 update_finetune_job(
@@
-        except openai.OpenAIError as e:
-            error_msg = handle_openai_error(e)
+        except openai.OpenAIError as e:
+            error_msg = str(e)
             logger.error(
                 f"[Retrieve_fine_tune_status] Failed to retrieve OpenAI job | "
                 f"provider_job_id={mask_string(job.provider_job_id)}, "
                 f"error={error_msg}, fine_tuning_id={fine_tuning_id}, project_id={project_id}"
             )

Also applies to: 123-137, 312-325


161-176: Guard update on failure when fine_tune is None.

If fetch_by_id fails before assignment, the except block will call update_finetune_job with job=None, masking the original error. Guard it.

         except Exception as e:
             error_msg = str(e)
             logger.error(
                 f"[process_fine_tuning_job] Background job failure: {e} | "
                 f"job_id={job_id}, project_id={project_id}|"
             )
-            update_finetune_job(
-                session=session,
-                job=fine_tune,
-                update=FineTuningUpdate(
-                    status=FineTuningStatus.failed,
-                    error_message="Error while processing the background job : "
-                    + error_msg,
-                ),
-            )
+            if fine_tune is not None:
+                update_finetune_job(
+                    session=session,
+                    job=fine_tune,
+                    update=FineTuningUpdate(
+                        status=FineTuningStatus.failed,
+                        error_message="Error while processing the background job : " + error_msg,
+                    ),
+                )
+            else:
+                logger.error(
+                    "[process_fine_tuning_job] Could not mark job failed because fine_tune was not loaded."
+                )
🧹 Nitpick comments (5)
backend/app/tests/api/routes/test_fine_tuning.py (3)

18-30: Remove unused helper create_file_mock.

It’s not referenced; drop to reduce noise.

-def create_file_mock(file_type):
-    counter = {"train": 0, "test": 0}
-
-    def _side_effect(file=None, purpose=None):
-        if purpose == "fine-tune":
-            if "train" in file.name:
-                counter["train"] += 1
-                return MagicMock(id=f"file_{counter['train']}")
-            elif "test" in file.name:
-                counter["test"] += 1
-                return MagicMock(id=f"file_{counter['test']}")
-
-    return _side_effect

59-62: Avoid writing temporary JSONL files in this test.

process_fine_tuning_job is patched; these files aren’t used. Removing speeds up tests and avoids FS writes.

-        for path in ["/tmp/train.jsonl", "/tmp/test.jsonl"]:
-            with open(path, "w") as f:
-                f.write('{"prompt": "test", "completion": "label"}')
+        # No-op: background processing is mocked; no local files needed

111-116: Compare status using the Enum, not a string.

DB model uses FineTuningStatus; assert against the Enum to avoid brittle string comparisons.

-            assert (
-                job.status == "pending"
-            )  # Since background processing is mocked, status remains pending
+            assert job.status == FineTuningStatus.pending  # background job is mocked
backend/app/api/routes/fine_tuning.py (2)

326-340: Type alignment: mapped_status is an Enum; FineTuningUpdate.status is str.

Not a blocker, but consider keeping types consistent (use FineTuningStatus throughout). Also rename annotation to Optional[FineTuningStatus] for clarity.

Also applies to: 341-346


213-216: CSV validation condition is permissive by design — confirm intended.

Current check accepts if either extension OR content-type indicates CSV. If you intended both, switch to “or” inside the negation. Otherwise, this is fine.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a41d99 and 724497b.

📒 Files selected for processing (4)
  • backend/app/api/routes/fine_tuning.py (7 hunks)
  • backend/app/core/finetune/evaluation.py (1 hunks)
  • backend/app/models/fine_tuning.py (1 hunks)
  • backend/app/tests/api/routes/test_fine_tuning.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
backend/app/tests/api/routes/test_fine_tuning.py (6)
backend/app/tests/utils/test_data.py (1)
  • create_test_fine_tuning_jobs (133-160)
backend/app/tests/utils/utils.py (1)
  • get_document (116-137)
backend/app/models/fine_tuning.py (2)
  • Fine_Tuning (55-90)
  • FineTuningStatus (13-18)
backend/app/models/model_evaluation.py (2)
  • ModelEvaluation (38-85)
  • ModelEvaluationStatus (14-18)
backend/app/core/cloud/storage.py (5)
  • client (30-42)
  • put (120-122)
  • put (150-177)
  • get_signed_url (135-137)
  • get_signed_url (217-242)
backend/app/tests/conftest.py (2)
  • db (24-41)
  • user_api_key_header (77-79)
backend/app/api/routes/fine_tuning.py (9)
backend/app/models/fine_tuning.py (4)
  • FineTuningJobCreate (29-52)
  • FineTuningJobPublic (104-120)
  • FineTuningUpdate (93-101)
  • FineTuningStatus (13-18)
backend/app/models/document.py (1)
  • Document (20-40)
backend/app/models/model_evaluation.py (2)
  • ModelEvaluationBase (21-26)
  • ModelEvaluationStatus (14-18)
backend/app/crud/model_evaluation.py (2)
  • create_model_evaluation (20-58)
  • fetch_active_model_evals (150-170)
backend/app/core/finetune/preprocessing.py (1)
  • DataPreprocessor (17-157)
backend/app/api/routes/model_evaluation.py (1)
  • run_model_evaluation (49-112)
backend/app/utils.py (1)
  • get_openai_client (175-205)
backend/app/core/cloud/storage.py (3)
  • get_cloud_storage (262-279)
  • put (120-122)
  • put (150-177)
backend/app/crud/document.py (1)
  • DocumentCrud (14-135)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (1)
backend/app/models/fine_tuning.py (1)

13-19: Add “cancelled” status — looks good; verify downstream handling and persistence.

Enum addition is fine. Please double-check:

  • OPENAI status mapping includes “cancelled” (it does in API routes).
  • Any DB constraints/migrations aren’t needed (column appears to be a string enum in SQLModel, so likely no migration, but verify).
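
For reference, the enum after this change is roughly as follows (member names other than cancelled are inferred from the statuses referenced elsewhere in this review):

from enum import Enum

class FineTuningStatus(str, Enum):
    pending = "pending"
    running = "running"
    completed = "completed"
    failed = "failed"
    cancelled = "cancelled"  # added in this PR

Because the enum subclasses str and the column stores plain strings, adding a member should not require a migration, consistent with the note above.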


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/app/core/finetune/evaluation.py (1)

5-5: Consider using built-in set type.

typing.Set is deprecated in Python 3.9+. Use the built-in set instead.

Apply this diff:

-from typing import Set
+from typing import Any

And update line 38:

-        self.allowed_labels: Set[str] = set()
+        self.allowed_labels: set[str] = set()

Based on static analysis.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 724497b and 1bbd0dd.

📒 Files selected for processing (1)
  • backend/app/core/finetune/evaluation.py (12 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/core/finetune/evaluation.py (2)
backend/app/core/cloud/storage.py (1)
  • AmazonCloudStorage (145-259)
backend/app/utils.py (1)
  • handle_openai_error (208-222)
🪛 Ruff (0.13.3)
backend/app/core/finetune/evaluation.py

5-5: typing.Set is deprecated, use set instead

(UP035)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (4)
backend/app/core/finetune/evaluation.py (4)

52-53: Log message formatting improvements enhance readability.

The string splitting across multiple lines improves code readability while maintaining semantic equivalence. These changes follow Python best practices for long strings.

Also applies to: 68-70, 73-74, 89-92, 96-97, 118-119, 125-126, 137-139, 152-153, 165-166, 172-174, 178-180, 185-187, 206-207, 216-217, 221-222, 232-233


198-199: UUID-based filename generation prevents collisions.

Using uuid.uuid4().hex to generate unique filenames is a good practice that prevents potential filename collisions in S3 storage.


248-251: Multi-line docstring follows PEP 257 conventions.

Converting to a multi-line docstring improves readability and adheres to Python documentation best practices.


3-4: Import path update validated; no remaining imports of handle_openai_error from the old module. Addition of uuid import supports unique filename generation. Approving changes.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/app/api/routes/fine_tuning.py (2)

94-109: Undefined handle_openai_error and fragile exception type (openai.OpenAIError) — will raise at runtime

handle_openai_error is not defined/imported; these blocks will crash on exception. Also, with the current OpenAI Python client, OpenAIError may not exist; prefer catching Exception or openai.APIError variants.

Apply these diffs to make error handling robust and avoid NameError:

-            except openai.OpenAIError as e:
-                error_msg = handle_openai_error(e)
+            except Exception as e:
+                error_msg = getattr(e, "message", str(e))
                 logger.error(
                     f"[process_fine_tuning_job] Failed to upload to OpenAI: {error_msg} | "
                     f"job_id={job_id}, project_id={project_id}|"
                 )
-            except openai.OpenAIError as e:
-                error_msg = handle_openai_error(e)
+            except Exception as e:
+                error_msg = getattr(e, "message", str(e))
                 logger.error(
                     f"[process_fine_tuning_job] Failed to create OpenAI fine-tuning job: {error_msg} | "
                     f"job_id={job_id}, project_id={project_id}|"
                 )
-        except openai.OpenAIError as e:
-            error_msg = handle_openai_error(e)
+        except Exception as e:
+            error_msg = getattr(e, "message", str(e))
             logger.error(
                 f"[Retrieve_fine_tune_status] Failed to retrieve OpenAI job | "
                 f"provider_job_id={mask_string(job.provider_job_id)}, "
                 f"error={error_msg}, fine_tuning_id={fine_tuning_id}, project_id={project_id}"
             )
             raise HTTPException(
                 status_code=502, detail=f"OpenAI API error: {error_msg}"
             )

Also applies to: 123-138, 303-313


161-175: Guard update on background failure when fine_tune is None

If fetch_by_id failed before assignment, update_finetune_job will be called with job=None.

-            update_finetune_job(
-                session=session,
-                job=fine_tune,
-                update=FineTuningUpdate(
-                    status=FineTuningStatus.failed,
-                    error_message="Error while processing the background job : "
-                    + error_msg,
-                ),
-            )
+            if fine_tune is not None:
+                update_finetune_job(
+                    session=session,
+                    job=fine_tune,
+                    update=FineTuningUpdate(
+                        status=FineTuningStatus.failed,
+                        error_message="Error while processing the background job : "
+                        + error_msg,
+                    ),
+                )
🧹 Nitpick comments (6)
backend/app/api/routes/fine_tuning.py (6)

202-205: CSV validation is brittle; accept common CSV MIME types or require at least one signal

Current check rejects only when both extension and content-type are wrong; also many clients send application/vnd.ms-excel for CSV. Prefer validating if neither suggests CSV.

-    if not file.filename.lower().endswith(".csv") and file.content_type != "text/csv":
-        raise HTTPException(status_code=400, detail="File must be a CSV file")
+    content_type = (file.content_type or "").lower()
+    if not (file.filename.lower().endswith(".csv") or "csv" in content_type):
+        raise HTTPException(status_code=400, detail="File must be a CSV file")

213-217: S3 object key lacks filename/extension; prefer a clear, namespaced key

Storing at just UUID loses context and extension; use a stable prefix and keep .csv for easier inspection and tooling.

-    document_id = uuid4()
-    object_store_url = storage.put(file, Path(str(document_id)))
+    document_id = uuid4()
+    key = Path("datasets") / f"{document_id}.csv"
+    object_store_url = storage.put(file, key)

197-201: Deduplicate and validate split ratios before creating jobs

Avoid duplicate jobs when users pass repeated ratios, and enforce model-level validation only once.

-    try:
-        split_ratios = [float(r.strip()) for r in split_ratio.split(",")]
+    try:
+        split_ratios = [float(r.strip()) for r in split_ratio.split(",")]
+        # remove duplicates, preserve order
+        seen = set()
+        split_ratios = [r for r in split_ratios if (r not in seen and not seen.add(r))]

315-317: Type hint mismatch for mapped_status

The mapping returns FineTuningStatus, not str. Update annotation.

-        mapped_status: Optional[str] = OPENAI_TO_INTERNAL_STATUS.get(
+        mapped_status: Optional[FineTuningStatus] = OPENAI_TO_INTERNAL_STATUS.get(
             getattr(openai_job, "status", None)
         )

77-83: Cleanup may not run if preprocessing fails before upload

preprocessor.cleanup() is only in the inner upload try/finally; failures in preprocessor.process() can bypass cleanup and leak temp artifacts.

Wrap creation, processing, and upload in a broader try/finally and always call cleanup() if preprocessor was created:

  • Initialize preprocessor = None before use.
  • After creating it, use a surrounding try: ... finally: preprocessor.cleanup().

I can propose an exact patch if you prefer.

Also applies to: 110-114
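
A minimal sketch of that broader try/finally (constructor arguments and the upload step are placeholders, not the route's exact code):

preprocessor = None
try:
    preprocessor = DataPreprocessor(...)  # args omitted; see app/core/finetune/preprocessing.py
    artifacts = preprocessor.process()
    upload_artifacts(artifacts)  # placeholder for the OpenAI / cloud-storage upload step
finally:
    if preprocessor is not None:
        preprocessor.cleanup()  # runs even if process() or the upload raises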


280-282: Response payload: include counts for better UX (optional)

Consider adding created_count/total to the response for clients to act on.

-    return APIResponse.success_response(
-        {"message": message, "document_id": str(created_document.id), "jobs": job_infos}
-    )
+    return APIResponse.success_response(
+        {
+            "message": message,
+            "document_id": str(created_document.id),
+            "created_count": created_count,
+            "total": total,
+            "jobs": job_infos,
+        }
+    )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1bbd0dd and c47b254.

📒 Files selected for processing (1)
  • backend/app/api/routes/fine_tuning.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/api/routes/fine_tuning.py (9)
backend/app/models/fine_tuning.py (4)
  • FineTuningJobCreate (29-52)
  • FineTuningJobPublic (104-120)
  • FineTuningUpdate (93-101)
  • FineTuningStatus (13-18)
backend/app/models/document.py (1)
  • Document (20-40)
backend/app/models/model_evaluation.py (2)
  • ModelEvaluationBase (21-26)
  • ModelEvaluationStatus (14-18)
backend/app/crud/model_evaluation.py (2)
  • create_model_evaluation (20-58)
  • fetch_active_model_evals (150-170)
backend/app/core/finetune/preprocessing.py (1)
  • DataPreprocessor (17-157)
backend/app/api/routes/model_evaluation.py (1)
  • run_model_evaluation (49-112)
backend/app/utils.py (1)
  • get_openai_client (175-205)
backend/app/core/cloud/storage.py (3)
  • get_cloud_storage (262-279)
  • put (120-122)
  • put (150-177)
backend/app/crud/document.py (1)
  • DocumentCrud (14-135)

Comment on lines +330 to +335
# Check if status is changing from running to completed
is_newly_completed = (
job.status == FineTuningStatus.running
and update_payload.status == FineTuningStatus.completed
)


⚠️ Potential issue | 🟠 Major

Race: auto-evaluation can be created twice under concurrent refresh calls

Two concurrent requests can both observe “no active evals” and create duplicates. Add a DB-level guard (unique constraint/partial index) or a transactional re-check/catch on insert.

  • Option A (preferred): add a unique index on model_evaluation.fine_tuning_id where is_deleted=false and status != 'failed', then wrap create in try/except IntegrityError; on conflict, skip.
  • Option B: within a session transaction, lock the fine-tuning row (SELECT ... FOR UPDATE), re-check active evals, then create.

I can provide a concrete migration + guarded create if desired.

Also applies to: 343-369

🤖 Prompt for AI Agents
In backend/app/api/routes/fine_tuning.py around lines 330-335 (also applies to
343-369), two concurrent requests can both see “no active evals” and create
duplicate model_evaluation rows; implement a DB-level guard and guarded create.
Preferred fix: add a migration that creates a unique partial index on
model_evaluation(fine_tuning_id) WHERE is_deleted = false AND status !=
'failed', then wrap the create in a try/except catching IntegrityError and
skip/return if conflict occurs. Alternative: perform the check-and-create inside
a DB transaction that locks the fine_tuning row (SELECT ... FOR UPDATE),
re-check for active evals, then insert. Ensure sessions/transactions are used
consistently and surface a clear response when insert is skipped due to
integrity conflict.
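
A sketch of Option A, combining the partial unique index quoted above with an IntegrityError guard around the create (helper signatures are assumptions):

from sqlalchemy import text
from sqlalchemy.exc import IntegrityError

# Migration-side partial unique index (DDL as suggested above):
CREATE_ACTIVE_EVAL_INDEX = text(
    "CREATE UNIQUE INDEX IF NOT EXISTS uq_active_model_eval_per_fine_tuning "
    "ON model_evaluation (fine_tuning_id) "
    "WHERE is_deleted = false AND status != 'failed'"
)

def create_evaluation_if_absent(session, fine_tuning_id, background_tasks):
    """Guarded create: if a concurrent refresh already inserted an active eval, skip."""
    try:
        evaluation = create_model_evaluation(
            session, ModelEvaluationBase(fine_tuning_id=fine_tuning_id)
        )
        session.commit()
    except IntegrityError:
        session.rollback()  # the other request won the race; skip quietly
        return None
    background_tasks.add_task(run_model_evaluation, evaluation.id)
    return evaluation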


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
backend/app/tests/api/routes/test_fine_tuning.py (2)

42-53: Consider extracting S3 setup to a pytest fixture.

The S3 bucket setup logic is correct and handles regional differences properly. However, if additional tests need S3 mocking in the future, this setup would need to be duplicated.

Consider creating a reusable fixture:

@pytest.fixture
def mock_s3_bucket():
    with mock_aws():
        s3 = boto3.client("s3", region_name=settings.AWS_DEFAULT_REGION)
        bucket_name = settings.AWS_S3_BUCKET_PREFIX
        if settings.AWS_DEFAULT_REGION == "us-east-1":
            s3.create_bucket(Bucket=bucket_name)
        else:
            s3.create_bucket(
                Bucket=bucket_name,
                CreateBucketConfiguration={
                    "LocationConstraint": settings.AWS_DEFAULT_REGION
                },
            )
        yield s3
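
A test would then take the fixture as a parameter instead of repeating the setup, for example (illustrative only, not the repository's actual test):

def test_bucket_starts_empty(mock_s3_bucket):
    objects = mock_s3_bucket.list_objects_v2(Bucket=settings.AWS_S3_BUCKET_PREFIX)
    assert objects["KeyCount"] == 0  # moto-backed bucket exists and is empty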

110-115: Use enum values instead of string literals for status checks.

Line 113 uses a string literal "pending" instead of the imported FineTuningStatus.pending enum value. This pattern appears throughout the test file (lines 146, 174, 203, 260, 338, 392, 409, 456).

Apply this diff to use the enum consistently:

 for job in jobs:
     db.refresh(job)
     assert (
-        job.status == "pending"
+        job.status == FineTuningStatus.pending
     )  # Since background processing is mocked, status remains pending
     assert job.split_ratio in [0.5, 0.7, 0.9]

Repeat this change for all similar assertions throughout the file (lines 146, 174, 203, 260, 338, 392, 409, 456).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c04275b and 8801b39.

📒 Files selected for processing (1)
  • backend/app/tests/api/routes/test_fine_tuning.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use type hints in Python code (Python 3.11+ project)

Files:

  • backend/app/tests/api/routes/test_fine_tuning.py
🧬 Code graph analysis (1)
backend/app/tests/api/routes/test_fine_tuning.py (6)
backend/app/tests/utils/test_data.py (1)
  • create_test_fine_tuning_jobs (133-160)
backend/app/tests/utils/utils.py (1)
  • get_document (116-137)
backend/app/models/fine_tuning.py (2)
  • Fine_Tuning (55-90)
  • FineTuningStatus (13-18)
backend/app/models/model_evaluation.py (2)
  • ModelEvaluation (38-85)
  • ModelEvaluationStatus (14-18)
backend/app/core/cloud/storage.py (5)
  • client (30-42)
  • put (120-122)
  • put (150-177)
  • get_signed_url (135-137)
  • get_signed_url (217-242)
backend/app/tests/conftest.py (2)
  • db (24-41)
  • user_api_key_header (77-79)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (9)
backend/app/tests/api/routes/test_fine_tuning.py (9)

1-15: LGTM!

The imports are well-organized and appropriate for the moto-based S3 mocking migration. The addition of FineTuningStatus, ModelEvaluation, and ModelEvaluationStatus aligns with the new auto-evaluation testing requirements.


118-180: LGTM with minor note!

The retrieve tests properly validate status synchronization from OpenAI and error message propagation. The updated provider_job_id naming convention ("ftjob-mock_job_123") is consistent throughout.

Note: The string literal status assertions (lines 146, 174) should use enum values as suggested in the previous comment.


182-203: LGTM!

The fetch test properly verifies that multiple jobs can be retrieved by document ID and that their attributes are correctly populated.


206-212: LGTM!

The test class is well-structured with appropriate decorators and a clear docstring explaining its purpose. The class-level patches ensure all test methods have the necessary mocks.


276-351: LGTM with verification note!

This test properly validates that duplicate evaluations are not created when an active evaluation already exists. The logic correctly checks for existing evaluations before triggering a new one.

Note: The same foreign key field verification mentioned in the previous comment applies to line 346.


352-411: LGTM!

Excellent test coverage for non-completion status transitions. The test validates that evaluations are only triggered on completion, not for intermediate states (running) or failures. This prevents unnecessary evaluation attempts.


412-467: LGTM!

This test properly validates that refreshing an already completed job doesn't trigger new evaluations. The logic ensures evaluation is only triggered on the initial status transition to completed, preventing duplicate evaluation jobs.


213-275: No changes needed for ModelEvaluation foreign key
Verified that ModelEvaluationBase defines fine_tuning_id as a foreign key. The test's usage of model_eval.fine_tuning_id is correct.


73-77: No change required for mock return type. The route casts the put result to str before persisting and Document.object_store_url is defined as a string, so returning a string in the test accurately reflects its usage.

Likely an incorrect or invalid review comment.

Comment on lines 59 to +61

         for path in ["/tmp/train.jsonl", "/tmp/test.jsonl"]:
             with open(path, "w") as f:
-                f.write("{}")
-
-        mock_preprocessor = MagicMock()
-        mock_preprocessor.process.return_value = {
-            "train_jsonl_temp_filepath": "/tmp/train.jsonl",
-            "train_csv_s3_object": "s3://bucket/train.csv",
-            "test_csv_s3_object": "s3://bucket/test.csv",
-        }
-        mock_preprocessor.cleanup = MagicMock()
-        mock_preprocessor_cls.return_value = mock_preprocessor
-
-        mock_openai = MagicMock()
-        mock_openai.files.create.side_effect = create_file_mock("fine-tune")
-        mock_openai.fine_tuning.jobs.create.side_effect = [
-            MagicMock(id=f"ft_mock_job_{i}", status="running") for i in range(1, 4)
-        ]
-        mock_get_openai_client.return_value = mock_openai
-
-        body = {
-            "document_id": str(document.id),
-            "base_model": "gpt-4",
-            "split_ratio": [0.5, 0.7, 0.9],
-            "system_prompt": "you are a model able to classify",
-        }
-
-        with patch("app.api.routes.fine_tuning.Session") as SessionMock:
-            SessionMock.return_value.__enter__.return_value = db
-            SessionMock.return_value.__exit__.return_value = None
-
-            response = client.post(
-                "/api/v1/fine_tuning/fine_tune",
-                json=body,
-                headers=user_api_key_header,
-            )
+                f.write('{"prompt": "test", "completion": "label"}')

⚠️ Potential issue | 🟡 Minor

Add cleanup for temporary test files.

The test creates files in /tmp/ but doesn't clean them up after the test completes. This can lead to test pollution and potential conflicts in CI/CD environments.

Consider using pytest.fixture with cleanup or tempfile.TemporaryDirectory():

+from tempfile import TemporaryDirectory
+
 # Setup test files for preprocessing
-for path in ["/tmp/train.jsonl", "/tmp/test.jsonl"]:
-    with open(path, "w") as f:
-        f.write('{"prompt": "test", "completion": "label"}')
+with TemporaryDirectory() as tmpdir:
+    for fname in ["train.jsonl", "test.jsonl"]:
+        path = os.path.join(tmpdir, fname)
+        with open(path, "w") as f:
+            f.write('{"prompt": "test", "completion": "label"}')

Alternatively, use a pytest fixture with automatic cleanup:

@pytest.fixture
def temp_jsonl_files():
    paths = ["/tmp/train.jsonl", "/tmp/test.jsonl"]
    for path in paths:
        with open(path, "w") as f:
            f.write('{"prompt": "test", "completion": "label"}')
    yield paths
    for path in paths:
        if os.path.exists(path):
            os.remove(path)
🤖 Prompt for AI Agents
In backend/app/tests/api/routes/test_fine_tuning.py around lines 59 to 61 the
test writes files directly into /tmp without cleaning them up; change the test
to create temporary files and ensure they are removed after the test — either
replace the hardcoded /tmp paths with tempfile.NamedTemporaryFile or
tempfile.TemporaryDirectory to produce paths that are automatically cleaned, or
implement a pytest fixture that creates the files, yields their paths, and
deletes them in the teardown (remove each path with os.remove if exists). Ensure
any required imports (tempfile, os, pytest) are added and that the test uses the
fixture or temp file paths instead of hardcoded /tmp paths.

@AkhileshNegi AkhileshNegi merged commit 4813d95 into main Oct 10, 2025
3 checks passed
@AkhileshNegi AkhileshNegi deleted the feature/classification-unified-api branch October 10, 2025 05:13

Labels

enhancement New feature or request ready-for-review


Development

Successfully merging this pull request may close these issues.

Classification: Unified API endpoint handling all finetuning - model evaluations functions internally

2 participants