Classification: small fixes and storage related changes #365
Conversation
Approving with comments; please try to handle errors for all AWS storage calls before merging.
top_model = fetch_top_model_by_doc_id(session, document_id, current_user.project_id)
storage = get_cloud_storage(session=session, project_id=current_user.project_id)
Should we not handle errors here? What if get_signed_url fails?
Error handling is already in place in the get_signed_url method; I will add it to the get_cloud_storage method as well, so that errors there are also handled.
Added error handling to get_cloud_storage as well.
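To illustrate the change being discussed, here is a minimal sketch of how get_cloud_storage could wrap storage initialization so backend failures surface as one handled error. The AmazonCloudStorage stand-in and the CloudStorageError name are assumptions for illustration, not the project's actual classes:

```python
import logging

logger = logging.getLogger(__name__)


class CloudStorageError(Exception):
    """Hypothetical error raised when the storage backend cannot be initialized."""


class AmazonCloudStorage:
    """Stand-in for the real storage class; assume its constructor
    can raise on missing credentials or misconfiguration."""

    def __init__(self, project_id):
        if project_id is None:
            raise ValueError("missing project_id")
        self.project_id = project_id


def get_cloud_storage(session=None, project_id=None):
    """Initialize storage, converting any backend failure into one handled error."""
    try:
        return AmazonCloudStorage(project_id)
    except Exception as exc:
        # Log and re-raise as a single well-known exception type so callers
        # only need to handle CloudStorageError.
        logger.warning("Cloud storage init failed for project %s: %s", project_id, exc)
        raise CloudStorageError("cloud storage unavailable") from exc
```

Callers can then catch CloudStorageError in one place instead of anticipating every AWS-specific exception.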
    attach a signed prediction data file URL (if available).
    """
    s3_key = getattr(model_obj, "prediction_data_s3_object", None)
    prediction_data_file_url = storage.get_signed_url(s3_key) if s3_key else None
If the s3_key is None, what happens? Should we send an empty/null value in the response JSON?
Also, maybe include error handling there, something like this:
try:
    prediction_data_file_url = storage.get_signed_url(s3_key)
except Exception as e:
    logger.warning(f"Failed to generate signed URL for {s3_key}: {e}")
    prediction_data_file_url = None
If the s3_key is None, then there's simply no prediction file to sign, so the function sets prediction_data_file_url to None. In the API response this shows up as null, which is fine since the ModelEvaluationPublic schema already defaults this field to None.
We don't need extra error handling inside attach_prediction_file_url, because any errors from generating the signed URL are already logged and handled in the get_signed_url method itself. This keeps the logic clean: attach_prediction_file_url just decides whether or not to attempt URL generation, while get_signed_url is responsible for managing AWS-specific failures.
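The division of responsibility described above can be sketched as follows. FakeStorage is a hypothetical in-memory stand-in (not the project's real storage wrapper) whose get_signed_url swallows backend errors and returns None, while attach_prediction_file_url only decides whether signing should be attempted:

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class FakeStorage:
    """Illustrative stand-in for the storage wrapper: get_signed_url
    handles backend failures internally and returns None on error."""

    def __init__(self, fail: bool = False):
        self.fail = fail  # flip on to simulate an AWS failure

    def get_signed_url(self, s3_key: str) -> Optional[str]:
        try:
            if self.fail:
                raise RuntimeError("simulated AWS failure")
            return f"https://example.com/{s3_key}?signed=1"
        except Exception as e:
            logger.warning("Failed to generate signed URL for %s: %s", s3_key, e)
            return None


def attach_prediction_file_url(model_obj, storage) -> Optional[str]:
    # No key means there is no prediction file to sign; the resulting
    # None serializes as null in the API response.
    s3_key = getattr(model_obj, "prediction_data_s3_object", None)
    return storage.get_signed_url(s3_key) if s3_key else None
```

With this split, a missing key and a failed signing attempt both surface to the client as null, and AWS-specific handling lives in exactly one place.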
* Classification: db models and migration script (#305)
  * db models and migration script
* Classification: Fine tuning initiation and retrieve endpoint (#315)
  * Fine-tuning core, initiation, and retrieval
  * separate session for bg task, and formatting fixes
  * fixing alembic revision
* Classification: Model evaluation of fine tuned models (#326)
  * Model evaluation of fine tuned models
  * fixing alembic revision
  * alembic revision fix
* Classification: train and test data to s3 (#343)
  * alembic file for adding and removing columns
  * train and test s3 url column
  * updating alembic revision
  * formatting fix
* Classification: retaining prediction and fetching data from s3 for model evaluation (#359)
  * adding new columns to model eval table
  * test data and prediction data s3 url changes
  * single migration file
  * status enum columns
  * document seeding
* Classification: small fixes and storage related changes (#365)
  * first commit covering all
  * changing model name to fine tuned model in model eval
  * error handling in get cloud storage and document not found error handling
  * fixing alembic revision
  * uv lock
  * new uv lock file
  * updated uv lock file
  * coderabbit suggestions and removing unused imports
  * changes in uv lock file
  * making csv a supported file format, changing uv lock and pyproject toml
Summary
Target issue is #364
Checklist
Before submitting a pull request, please ensure that you complete these tasks.
Run fastapi run --reload app/main.py or docker compose up in the repository root and test.
Notes
Object storage field renamed for clarity: *_s3_url → *_s3_object across models, CRUD, migrations, preprocessing, and business logic.
Public API enrichment: added signed file URLs (derived from object keys) to public responses for fine-tuning datasets and evaluation predictions.
Error message column now populated with the actual error instead of just "failed during background job processing", in both cases: fine tuning and model evaluation.
Consistency fixes:
Metric key standardized: mcc → mcc_score.
Route parameter/response corrections (e.g., /fine_tuning/{fine_tuning_id}/refresh; response shape for /model_evaluation/evaluate_models/). Renamed this endpoint's input parameter from "job_id" to "fine_tuning_id".
Use get_cloud_storage(...) instead of AmazonCloudStorage(...); also use the project id when initializing the document CRUD.
Preprocessor: storage.put now uses the file_path= kwarg, per the recent changes to the storage.put method.
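As a rough sketch of the keyword-only file_path= call style mentioned in the last note, here is a hypothetical in-memory stand-in for the storage wrapper (the object_name parameter is an assumption; the real signature may differ):

```python
from pathlib import Path


class StorageSketch:
    """Illustrative in-memory stand-in for the project's storage wrapper."""

    def __init__(self):
        self.objects = {}  # object key -> raw bytes

    def put(self, *, file_path, object_name):
        # Keyword-only file_path mirrors the updated storage.put call style;
        # passing it positionally would now raise a TypeError.
        data = Path(file_path).read_bytes()
        self.objects[object_name] = data
        return object_name
```

A caller such as the preprocessor would then invoke it as storage.put(file_path=local_csv_path, object_name=key), making the intent of each argument explicit at the call site.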