Skip to content

Update RFC-0002 to remove unused fields for implementation#8

Merged
mprahl merged 1 commit intomlflow:mainfrom
mprahl:update-exec-plugin
Apr 14, 2026
Merged

Update RFC-0002 to remove unused fields for implementation#8
mprahl merged 1 commit intomlflow:mainfrom
mprahl:update-exec-plugin

Conversation

@mprahl
Copy link
Copy Markdown
Collaborator

@mprahl mprahl commented Apr 7, 2026

error_message does not need to be a new database schema column. Reuse
the existing result field as the current implementation does.

Additionally, the holder_id in the scheduler lease will not be used.

@mprahl
Copy link
Copy Markdown
Collaborator Author

mprahl commented Apr 7, 2026

@B-Step62 @TomeHirata could you please take a look? I started on the database schema and proto changes implementation and saw some unnecessary fields.

status: JobStatus # executors return only terminal statuses from wait_for_job()
result: str | None = None
error_message: str | None = None
result: str | None = None # canonical terminal payload for both success and failure/timeout outcomes
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a json string, correct? What was the reason to make it str instead of dict[str, Any] in the dataclass?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current dataclass has it as a string in mlflow/server/jobs/utils.py:

@dataclass
class JobResult:
    succeeded: bool
    result: str | None = None  # serialized JSON string
    is_transient_error: bool | None = None
    error: str | None = None

@TomeHirata I can change it but existing code expects a string.

job_id: str,
status: JobStatus,
result: str | None = None,
error_message: str | None = None,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to remove error_message from this API? We still have error message in JobResult

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially leaning toward removing error_message here because the current database model does not have a separate error_message column and instead uses result for both success and failure payloads. After thinking it through more, I think it still makes sense to keep error_message on the API surface, since it represents a different semantic field even if storage normalizes both cases into the same persisted column.

Copy link
Copy Markdown
Collaborator

@TomeHirata TomeHirata Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it makes more sense to separate result from error message. Result should be ideally an outcome of the successful jobs.

@mprahl mprahl force-pushed the update-exec-plugin branch from 323355a to a25ad1e Compare April 8, 2026 13:17
@mprahl mprahl requested a review from TomeHirata April 8, 2026 13:17
@mprahl
Copy link
Copy Markdown
Collaborator Author

mprahl commented Apr 8, 2026

@TomeHirata updated!

error_message does not need to be a new database schema column. Reuse
the existing result field as the current implementation does.

Additionally, the holder_id in the scheduler lease will not be used.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
@mprahl mprahl force-pushed the update-exec-plugin branch 2 times, most recently from 7575adf to aa34822 Compare April 8, 2026 13:20
Copy link
Copy Markdown
Collaborator

@TomeHirata TomeHirata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants