Fix: Preserve Pydantic details when serialization fails #59401

Merged
edoakes merged 2 commits into ray-project:master from mgchoi239:fix/pydantic-validation-error-serialization
Dec 16, 2025

@mgchoi239
Contributor

@mgchoi239 mgchoi239 commented Dec 12, 2025

Fixes #59357

Description

The Pydantic error is not serializable, which makes it difficult to find the root cause of the bug. When a Pydantic ValidationError containing ArgsKwargs objects cannot be serialized, Ray now preserves the original error details, including the ValidationError message and full traceback, in the error message.

Details

Modified RayTaskError.__init__ to include the full traceback_str in the error message when serialization fails, allowing users to see the original ValidationError details even when the exception itself cannot be pickled.

Testing

  • Created test script that verifies original error details are preserved
  • Confirmed wrapped error is serializable
  • Verified original ValidationError information survives round-trip serialization
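
The fallback described above can be illustrated with a minimal standalone sketch. This is not Ray's actual code: `UnpicklableError` and `make_serializable` are hypothetical names, and a `threading.Lock` stands in for the unpicklable ArgsKwargs payload. The sketch only shows the general idea of replacing an exception that cannot be pickled with a plain one whose message carries the original traceback text.

```python
import pickle
import threading
import traceback


class UnpicklableError(Exception):
    """Hypothetical stand-in for an exception that carries an unpicklable payload."""

    def __init__(self, message, payload):
        super().__init__(message)
        self.payload = payload  # a lock here; pickle refuses to serialize it


def make_serializable(exc, traceback_str):
    """Return exc if it pickles; otherwise a plain exception carrying the details."""
    try:
        pickle.dumps(exc)
        return exc
    except Exception:
        # Assumption for illustration: fold the original traceback into the message.
        return RuntimeError(
            f"{type(exc).__name__} could not be serialized. "
            f"Original traceback:\n{traceback_str}"
        )


try:
    raise UnpicklableError("1 validation error for Model", threading.Lock())
except UnpicklableError as err:
    wrapped = make_serializable(err, traceback.format_exc())

# The wrapped error survives a pickle round trip with the original details intact.
restored = pickle.loads(pickle.dumps(wrapped))
print(restored)
```

The wrapper is an ordinary `RuntimeError` holding only a string, so it pickles reliably, while the message still contains the original exception type, message, and traceback.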

@mgchoi239 mgchoi239 requested a review from a team as a code owner December 12, 2025 07:31
@mgchoi239 mgchoi239 force-pushed the fix/pydantic-validation-error-serialization branch 4 times, most recently from 83d2a7b to 9c1916b Compare December 12, 2025 08:25
@ray-gardener ray-gardener bot added the core (Issues that should be addressed in Ray Core) and community-contribution (Contributed by the community) labels Dec 12, 2025
@edoakes edoakes added the go (add ONLY when ready to merge, run all tests) label Dec 12, 2025
Collaborator

@edoakes edoakes left a comment

looks good, just one minor comment

Comment on lines 148 to 151
logger.warning(
    f"The original cause of the RayTaskError ({err_type}) "
    f"isn't serializable: {e}. Preserving error details."
)
Collaborator

Suggested change
- logger.warning(
-     f"The original cause of the RayTaskError ({err_type}) "
-     f"isn't serializable: {e}. Preserving error details."
- )
+ logger.exception(
+     f"The original cause of the RayTaskError ({err_type}) couldn't be serialized."
+ )

Collaborator

no need for "preserving error details", and logger.exception will include and format stack trace automatically
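
For reference, a small standalone sketch of the `logger.exception` behavior mentioned here, using only the standard library. The logger name, the `err_type` value, and the lock used to trigger a pickling failure are placeholders for illustration and are not taken from the Ray code.

```python
import io
import logging
import pickle
import threading

# Capture log output in memory so it can be inspected afterwards.
stream = io.StringIO()
logger = logging.getLogger("serialization_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

err_type = "ValidationError"  # placeholder value for illustration

try:
    pickle.dumps(threading.Lock())  # locks cannot be pickled
except Exception:
    # logger.exception logs at ERROR level and appends the traceback of the
    # exception currently being handled, so it need not be formatted by hand.
    logger.exception(
        f"The original cause of the RayTaskError ({err_type}) couldn't be serialized."
    )

output = stream.getvalue()
print(output)
```

Because `logger.exception` attaches the active exception's traceback itself, the message does not need to interpolate the exception, which is why the suggested text is shorter.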

Contributor Author

Thanks for the review. Pushed the new change

@edoakes
Collaborator

edoakes commented Dec 12, 2025

@eicherseiji PTAL to help validate this solves your original issue

…ion fails

Fixes ray-project#59357

## Description
The Pydantic error is not serializable, making it difficult to root cause the bug. When a Pydantic ValidationError containing ArgsKwargs objects cannot be serialized, Ray now preserves the original error details including the ValidationError message and full traceback in the error message.

## Details
Modified RayTaskError.__init__ to include the full traceback_str in the error message when serialization fails, allowing users to see the original ValidationError details even when the exception itself cannot be pickled.

Signed-off-by: mgchoi239 <mg.choi.239@example.com>
@mgchoi239 mgchoi239 force-pushed the fix/pydantic-validation-error-serialization branch from 58b7869 to 8285b9f Compare December 12, 2025 18:37
@eicherseiji
Contributor

Hi @mgchoi239, thanks for the PR! The change works well. However, further investigation showed that the root cause of #59357 was missing ServeController logs. When the ServeController logs are present, Serve correctly surfaces the Pydantic validation error.

In particular, it's not immediately clear why the ServeController logs were missing from the Ray service stdout in the KubeRay setup used for #59357, and launched identically with python serve.py.

tl;dr: in the case where ServeController logs are for some reason missing, this change would immediately solve my debugging issue. But in general, users should not need this.

@edoakes with that context I will let you make the call on this change. I will update the original issue.

@mgchoi239 mgchoi239 force-pushed the fix/pydantic-validation-error-serialization branch from 3620183 to 508d609 Compare December 13, 2025 18:24
@edoakes
Collaborator

edoakes commented Dec 15, 2025

There was a test failure, likely due to the exception message changing: https://buildkite.com/ray-project/premerge/builds/55778#019b1927-7b54-4c26-8738-eb1f9117a6ae/622-1360

@mgchoi239 mgchoi239 force-pushed the fix/pydantic-validation-error-serialization branch from 9290aca to eb6dc29 Compare December 16, 2025 07:34
@edoakes edoakes merged commit d6b690f into ray-project:master Dec 16, 2025
6 checks passed
cszhu pushed a commit that referenced this pull request Dec 17, 2025
Fixes #59357

## Description
The Pydantic error is not serializable, making it difficult to root
cause the bug. When a Pydantic ValidationError containing ArgsKwargs
objects cannot be serialized, Ray now preserves the original error
details including the ValidationError message and full traceback in the
error message.

## Details
Modified RayTaskError.__init__ to include the full traceback_str in the
error message when serialization fails, allowing users to see the
original ValidationError details even when the exception itself cannot
be pickled.

## Testing
- Created test script that verifies original error details are preserved
- Confirmed wrapped error is serializable
- Verified original ValidationError information survives round-trip
serialization

Signed-off-by: mgchoi239 <mg.choi.239@example.com>
Co-authored-by: MG <mg@MGs-MacBook-Air.local>
Co-authored-by: mgchoi239 <mg.choi.239@example.com>
zzchun pushed a commit to zzchun/ray that referenced this pull request Dec 18, 2025
…59401)

Yicheng-Lu-llll pushed a commit to Yicheng-Lu-llll/ray that referenced this pull request Dec 22, 2025
…59401)

Labels

community-contribution: Contributed by the community
core: Issues that should be addressed in Ray Core
go: add ONLY when ready to merge, run all tests

Development

Successfully merging this pull request may close these issues.

[serve][kuberay] ServeController logs missing, replica RuntimeError not surfaced

3 participants