[ENH]: Make cuda_bindings UnicodeDecodeError more actionable

## Motivation

When CUDA Python receives a string from an underlying CUDA library, we generally expect it to decode cleanly into a Python `str`. When that assumption is wrong, the resulting `UnicodeDecodeError` is often technically accurate but not very actionable.

For users, the important question is not just "which byte failed to decode?" It is also:

- Which CUDA API returned the non-decodable string?
- What bytes did the CUDA library actually return?
- Is there any readable fragment in the returned data that helps identify the underlying issue?

PR #2118 exposed this gap while investigating a WSL/NVML process-name issue. The root cause there is handled separately, but the debugging experience showed that our current error message leaves out the information most useful for diagnosing driver/library string issues.

Today, if a CUDA library returns bytes that cannot be decoded as UTF-8, Python's default `UnicodeDecodeError` reports the codec, the first bad byte, and the byte offset. That is useful, but incomplete.

For example, an error like this:

```text
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 0: invalid start byte
```

does not say which CUDA API produced the bytes, nor does it show the returned buffer. Without that context, users and maintainers have to add ad hoc instrumentation or repro scripts before they can tell whether the CUDA library returned garbage, a string in an unexpected encoding, an unterminated buffer, or some other invalid data.

This makes rare string-decoding failures much harder to report, triage, and escalate to the owning CUDA library team.

## Suggested enhancement

Add a small internal helper for decoding C strings returned by CUDA libraries. The helper should keep the successful path unchanged: if the bytes decode normally, return the same Python `str` as today.

On decode failure, the helper should raise an error with additional context, including at least:

- The CUDA API or binding function that produced the string.
- The original decode failure.
- A bounded representation of the returned bytes, such as a capped `repr` or hex dump up to the first NUL byte or a fixed maximum length.

The goal is better diagnostics, not silent recovery. The helper should not guess encodings, run expensive detection, or return replacement-decoded strings. It should make failures easier to understand while preserving existing behavior for valid strings.

This can start with the string-returning APIs where we already convert C buffers to Python strings, and can be expanded incrementally as needed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH]: Make cuda_bindings UnicodeDecodeError more actionable #2122

Motivation

Suggested enhancement

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[ENH]: Make cuda_bindings UnicodeDecodeError more actionable #2122

Description

Motivation

Suggested enhancement

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions