
Conversation

@nishika26
Collaborator

@nishika26 nishika26 commented Sep 1, 2025

Summary

Target issue is #364

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, it is tested and has test cases.

Notes

  1. Object storage fields renamed for clarity: *_s3_url → *_s3_object across models, CRUD, migrations, preprocessing, and business logic.

  2. Public API enrichment: added signed file URLs (derived from object keys) to public responses for fine-tuning datasets and evaluation predictions.

  3. Error reporting: the error message column is now populated with the actual error instead of the generic "failed during background job processing", for both fine-tuning and model evaluation.

  4. Consistency fixes:

  • Metric key standardized: mcc → mcc_score.

  • Route parameter/response corrections (e.g., /fine_tuning/{fine_tuning_id}/refresh, response shape for /model_evaluation/evaluate_models/). The input parameter in this endpoint was renamed from "job_id" to "fine_tuning_id".

  • Use get_cloud_storage(...) instead of AmazonCloudStorage(...), and pass the project id when initializing the document CRUD.

  • Preprocessor: storage.put now uses the file_path= kwarg, in line with the recent changes to the storage.put method.

  5. Tests (fine-tuning): Updated the preprocessor mock to return *_s3_object fields and patched the route’s Session to reuse the test db.
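
To illustrate note 3, here is a minimal sketch of recording the real exception text on a failed background job. It is not the code from this PR: `JobRecord`, `run_background_job`, and `failing_work` are hypothetical names, and the record is a plain dataclass rather than a database model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JobRecord:
    """Hypothetical stand-in for a fine-tuning or model evaluation row."""

    status: str = "pending"
    error_message: Optional[str] = None


def run_background_job(job: JobRecord, work: Callable[[], None]) -> JobRecord:
    """Run `work` and store the actual exception text if it fails."""
    job.status = "running"
    try:
        work()
    except Exception as exc:
        job.status = "failed"
        # Keep the real error instead of a generic placeholder message.
        job.error_message = f"{type(exc).__name__}: {exc}"
    else:
        job.status = "completed"
    return job


def failing_work() -> None:
    # Illustrative failure; the message is made up for this example.
    raise ValueError("training file has no 'label' column")


failed_job = run_background_job(JobRecord(), failing_work)
ok_job = run_background_job(JobRecord(), lambda: None)
```

With this shape, a client polling the job sees the concrete cause in `error_message` rather than only a failed status.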

@coderabbitai

coderabbitai bot commented Sep 1, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@nishika26 nishika26 changed the title from "first commit covering all" to "Classification : small fixes and storage related changes" Sep 2, 2025
@nishika26 nishika26 self-assigned this Sep 2, 2025
@nishika26 nishika26 added the enhancement New feature or request label Sep 2, 2025
@nishika26 nishika26 linked an issue Sep 2, 2025 that may be closed by this pull request
@nishika26 nishika26 marked this pull request as ready for review September 2, 2025 07:04
Collaborator

@kartpop kartpop left a comment

approving with comments, please try and handle errors for all aws storage calls before merging

)

top_model = fetch_top_model_by_doc_id(session, document_id, current_user.project_id)
storage = get_cloud_storage(session=session, project_id=current_user.project_id)
Collaborator

should we not handle errors here? what if get_signed_url fails?

Collaborator Author

Error handling already exists in the get_signed_url method. I will add it to the get_cloud_storage method as well, so that errors there are also handled.

Collaborator Author

Added error handling to get_cloud_storage as well.
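
For illustration only, a minimal sketch of what error handling around storage initialization might look like. The actual get_cloud_storage implementation is not shown in this thread, so `CloudStorageError`, `get_cloud_storage_checked`, and the `factory` argument are hypothetical names used to keep the example self-contained.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


class CloudStorageError(Exception):
    """Hypothetical error type for storage initialization failures."""


def get_cloud_storage_checked(factory: Callable[[], Any], project_id: int) -> Any:
    """Build a storage client, logging and wrapping any failure."""
    try:
        return factory()
    except Exception as exc:
        logger.error(
            "Failed to initialise cloud storage for project %s: %s", project_id, exc
        )
        raise CloudStorageError(
            f"Could not initialise cloud storage for project {project_id}"
        ) from exc


def broken_factory() -> Any:
    # Simulates a provider failure such as missing credentials.
    raise ConnectionError("credentials missing")


try:
    get_cloud_storage_checked(broken_factory, project_id=1)
    outcome = "no error"
except CloudStorageError as exc:
    outcome = str(exc)
```

Wrapping the failure in one error type lets a route translate it into a single, predictable HTTP error instead of leaking provider-specific exceptions.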

attach a signed prediction data file URL (if available).
"""
s3_key = getattr(model_obj, "prediction_data_s3_object", None)
prediction_data_file_url = storage.get_signed_url(s3_key) if s3_key else None
Collaborator

If the s3_key is None, what happens? Should we send an empty/null value in the response JSON?

Also, maybe include error handling there, something like this:

    try:
        prediction_data_file_url = storage.get_signed_url(s3_key)
    except Exception as e:
        logger.warning(f"Failed to generate signed URL for {s3_key}: {e}")
        prediction_data_file_url = None

Collaborator Author

If the s3_key is None, then there’s simply no prediction file to sign, so the function will set prediction_data_file_url to None. In the API response, this will show up as null, which is fine since the ModelEvaluationPublic schema already specifies a default of None for this field.

We don’t need to add extra error handling inside attach_prediction_file_url, because any errors related to generating the signed URL are already being logged and handled in the get_signed_url method itself. This keeps the logic clean — attach_prediction_file_url just decides whether or not to attempt URL generation, while get_signed_url is responsible for managing AWS-specific failures.
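
The split of responsibilities described above can be sketched as follows. This is an assumption-laden illustration, not the merged code: `FakeStorage` is a hypothetical stand-in for the object returned by get_cloud_storage, and the function returns a plain dict here instead of the ModelEvaluationPublic schema.

```python
from types import SimpleNamespace
from typing import Any, Dict


class FakeStorage:
    """Hypothetical stand-in for the storage client; signs nothing for real."""

    def get_signed_url(self, key: str) -> str:
        return f"https://example.invalid/{key}?signature=fake"


def attach_prediction_file_url(model_obj: Any, storage: Any) -> Dict[str, Any]:
    """Decide whether to request a signed URL; signing errors stay in storage."""
    s3_key = getattr(model_obj, "prediction_data_s3_object", None)
    # No object key means there is no prediction file, so the URL stays None.
    url = storage.get_signed_url(s3_key) if s3_key else None
    return {"id": model_obj.id, "prediction_data_file_url": url}


with_file = attach_prediction_file_url(
    SimpleNamespace(id=1, prediction_data_s3_object="evals/1/predictions.csv"),
    FakeStorage(),
)
without_file = attach_prediction_file_url(
    SimpleNamespace(id=2, prediction_data_s3_object=None),
    FakeStorage(),
)
```

In the second call the field is None, which would serialize as null in a JSON response.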

@nishika26 nishika26 merged commit e9d09e3 into feature/classification Sep 4, 2025
1 check passed
@nishika26 nishika26 deleted the enhancement/classification_small_fixes branch September 4, 2025 08:08
AkhileshNegi pushed a commit that referenced this pull request Sep 4, 2025
* Classification: db models and migration script (#305)

* db models and migration script

* Classification: Fine tuning Initiation and retrieve endpoint (#315)

* Fine-tuning core, initiation, and retrieval

* seperate session for bg task, and formating fixes

* fixing alembic revision

* Classification : Model evaluation of fine tuned models (#326)

* Model evaluation of fine tuned models

* fixing alembic revision

* alembic revision fix

* Classification : train and test data to s3 (#343)

* alembic file for adding and removing columns

* train and test s3 url column

* updating alembic revision

* formatting fix

* Classification : retaining prediction and fetching data from s3 for model evaluation (#359)

* adding new columns to model eval table

* test data and prediction data s3 url changes

* single migration file

* status enum columns

* document seeding

* Classification : small fixes and storage related changes (#365)

* first commit covering all

* changing model name to fine tuned model in model eval

* error handling in get cloud storage and document not found error handling

* fixing alembic revision

* uv lock

* new uv lock file

* updated uv lock file

* coderabbit suggestions and removing unused imports

* changes in uv lock file

* making csv a supported file format, changing uv lock and pyproject toml

Labels

enhancement New feature or request ready-for-review

Development

Successfully merging this pull request may close these issues.

Classification immediate fixes and enhancements

3 participants