Evaluation validation & restart #557

Merged: 11 commits from czaloom-validate-jobs-before-compute into main on Apr 23, 2024
Conversation

@czaloom (Collaborator) commented Apr 21, 2024

Issue Description

Dataset and model validation for evaluations currently happens inside the metric computation, so a filter mismatch fails silently: the error is logged inside the job, but it is never surfaced to the caller.

Example

{"method": "compute_detection_metrics", "event": "Valor Exception: Evaluation '136'", "level": "error", "timestamp": "2024-04-19T21:24:08.293418Z", "exception": [{"exc_type": "RuntimeError", "exc_value": "Model '2fK6nXvtq9JfMnZntWiI7kLMH94_detect' does not meet filter requirements.", "syntax_error": null, "is_cause": false, "frames": [{"filename": "/src/valor_api/backend/metrics/metric_utils.py", "lineno": 292, "name": "wrapper", "line": "", "locals": {"args": "()", "kwargs": "\"{'db': <sqlalchemy.orm.session.Session object at 0x7fc75681a1d0>, 'evaluation_id\"+7", "db": "<sqlalchemy.orm.session.Session object at 0x7fc75681a1d0>", "evaluation_id": "136", "e": "'RuntimeError(\"Model \\'2fK6nXvtq9JfMnZntWiI7kLMH94_detect\\' does not meet filter re'+13", "fn": "<function compute_detection_metrics at 0x7fc759deec20>"}}, {"filename": "/src/valor_api/backend/metrics/detection.py", "lineno": 941, "name": "compute_detection_metrics", "line": "", "locals": {"db": "<sqlalchemy.orm.session.Session object at 0x7fc75681a1d0>", "evaluation_id": "136", "_": "()", "evaluation": "<valor_api.backend.models.Evaluation object at 0x7fc756818fa0>", "groundtruth_filter": "\"Filter(dataset_names=['2V2Z9CNQHCjuYu0R0XLCzFDlfD1_Object_Detection'], dataset_m\"+374", "prediction_filter": "\"Filter(dataset_names=['2V2Z9CNQHCjuYu0R0XLCzFDlfD1_Object_Detection'], dataset_m\"+408", "parameters": "\"EvaluationParameters(task_type=<TaskType.OBJECT_DETECTION: 'object-detection'>, \"+251", "datasets": "\"[(34, '2V2Z9CNQHCjuYu0R0XLCzFDlfD1_Object_Detection', {'task': 'Object Detection\"+376", "model": "None"}}]}]}

Expected Behavior

If a dataset or model has no data conforming to the evaluation filter, the job should finish with status Done and return no metrics.
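
A minimal sketch of that expected flow, assuming a simplified status enum and hypothetical helpers (`query_matching_datasets`, `query_matching_model`, `compute_metrics`) rather than the real valor_api internals:

```python
from enum import Enum


class EvaluationStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


def query_matching_datasets(db, evaluation):
    # Hypothetical: return datasets with data satisfying the evaluation filter.
    return []


def query_matching_model(db, evaluation):
    # Hypothetical: return the model if it has predictions matching the filter.
    return None


def run_evaluation(db, evaluation):
    """Validate the filters up front: if nothing matches, finish cleanly as
    DONE with an empty metric list instead of raising mid-computation."""
    datasets = query_matching_datasets(db, evaluation)
    model = query_matching_model(db, evaluation)
    if not datasets or model is None:
        evaluation.status = EvaluationStatus.DONE
        evaluation.metrics = []
        return evaluation
    evaluation.status = EvaluationStatus.RUNNING
    evaluation.metrics = compute_metrics(db, evaluation)  # hypothetical
    evaluation.status = EvaluationStatus.DONE
    return evaluation
```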

Other Additions

  • Evaluation jobs with a Failed status will restart if queried (see the sketch below).
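
A sketch of that restart path, reusing the EvaluationStatus enum from the sketch above; `fetch_evaluation` and `enqueue_evaluation_job` are hypothetical stand-ins for the real lookup and job dispatch:

```python
def get_evaluation(db, evaluation_id):
    """On read, re-enqueue a FAILED evaluation instead of returning the
    stale failure, so a transient error is retried on the next query."""
    evaluation = fetch_evaluation(db, evaluation_id)  # hypothetical lookup
    if evaluation.status == EvaluationStatus.FAILED:
        evaluation.status = EvaluationStatus.PENDING
        enqueue_evaluation_job(db, evaluation_id)  # hypothetical dispatcher
    return evaluation
```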

@czaloom linked an issue Apr 21, 2024 that may be closed by this pull request
@czaloom marked this pull request as ready for review April 21, 2024 17:46
@rsbowman-striveworks (Contributor) left a comment

The change to mark evaluations with no data as DONE makes sense to me. I'm open to the other changes but less sure they're the right thing to do.

Review threads (resolved):
  • api/valor_api/crud/_create.py
  • api/valor_api/backend/core/evaluation.py (outdated)
@ntlind previously approved these changes Apr 22, 2024
@ntlind dismissed their stale review April 22, 2024 15:18

waiting for thoughts on Sean's comments

@czaloom merged commit 0f8f727 into main Apr 23, 2024
10 checks passed
@czaloom deleted the czaloom-validate-jobs-before-compute branch April 23, 2024 16:10
Development

Successfully merging this pull request may close these issues:
  • BUG: Evaluations over empty sets should give response code

3 participants