cuda.core: expose mem-range attributes on ManagedBuffer (last_prefetch_location and read-side getters)

## Background

PR #1775 lands `ManagedBuffer` with a property-style advice API (`buf.read_mostly = ...`, `buf.preferred_location = ...`, `buf.accessed_by.add(...)`) for the *write* side of managed-memory advice. The corresponding *read* side is uneven:

- `buf.preferred_location` — exists, returns `Device | Host | None`.
- `buf.read_mostly` — exists as a getter (queries `CU_MEM_RANGE_ATTRIBUTE_READ_MOSTLY`).
- `buf.accessed_by` — exists as `AccessedBySetProxy`.
- **`last_prefetch_location` — missing.**

Because the last-prefetch query isn't exposed, the test suite reaches into `cuda.bindings.driver` directly:

```python
last = _get_int_attr(buf, driver.CUmem_range_attribute.CU_MEM_RANGE_ATTRIBUTE_LAST_PREFETCH_LOCATION)
```

As @leofang noted on PR #1775 (https://github.com/NVIDIA/cuda-python/pull/1775#discussion_r3250979676):

> The fact that these are needed at test time rings a bell. `cuda.core` tries hard to not leak the abstraction. This highlights a problem that we do not expose enough mem-range attributes for `ManagedBuffer`.

PR #1775 currently works around this with a private `_last_prefetch_location(buf)` helper in `tests/memory/test_managed_ops.py` carrying a TODO that points at this issue.

## Proposal

Add `ManagedBuffer.last_prefetch_location` mirroring `preferred_location`'s shape:

```python
@property
def last_prefetch_location(self) -> Device | Host | None:
    """Location of the most recent prefetch on this range, or ``None``
    if no prefetch has been issued.
    """
```

Returns:
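- `Device(i)` for a device ordinal `i >= 0`
- `Host()` for the legacy `-1` ordinal (`CU_DEVICE_CPU`)
- `None` for the "no prefetch yet" sentinel (`CU_DEVICE_INVALID`, which the driver also reports when the pages in the range were not all prefetched to the same location)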
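
A minimal sketch of the legacy decoding described above, for discussion only. `Device` and `Host` here are stand-in dataclasses rather than the real `cuda.core` classes, and `decode_legacy_location` is a hypothetical helper name. The sentinel values assume the driver's `CU_DEVICE_CPU == -1` and `CU_DEVICE_INVALID == -2`.

```python
from __future__ import annotations

from dataclasses import dataclass


# Stand-ins for cuda.core's Device and Host, used only to keep the sketch
# self-contained. The real classes carry more state than this.
@dataclass(frozen=True)
class Device:
    device_id: int


@dataclass(frozen=True)
class Host:
    pass


CU_DEVICE_CPU = -1      # driver ordinal meaning "host"
CU_DEVICE_INVALID = -2  # driver ordinal meaning "no single last-prefetch location"


def decode_legacy_location(raw: int) -> Device | Host | None:
    """Map the raw int returned for LAST_PREFETCH_LOCATION to a location object."""
    if raw >= 0:
        return Device(raw)
    if raw == CU_DEVICE_CPU:
        return Host()
    # CU_DEVICE_INVALID, or any other negative value, is treated as "unknown".
    return None
```

With a mapping like this, the property body would reduce to one attribute query followed by one call to the decoder, which is presumably close to what `preferred_location` already does.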

On CUDA 13, verify whether `CU_MEM_RANGE_ATTRIBUTE_LAST_PREFETCH_LOCATION_TYPE` / `_ID` exist; if they do, layer a v2 path the same way `preferred_location` does so `Host(numa_id=N)` round-trips. Otherwise document the legacy-attribute caveat consistently with `preferred_location`.
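
If the `_TYPE` / `_ID` attributes are available, the v2 path could decode the pair roughly as follows. This is a sketch under assumptions: `Device` and `Host` are stand-in dataclasses, `decode_v2_location` is a hypothetical name, and the integer values are believed to match `CUmemLocationType` in `cuda.h` but should be checked against the bindings. Only the three location types relevant to this proposal are handled; anything else maps to `None`.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


# Stand-ins for cuda.core's Device and Host (illustration only).
@dataclass(frozen=True)
class Device:
    device_id: int


@dataclass(frozen=True)
class Host:
    numa_id: int | None = None


class LocationType(IntEnum):
    # Assumed to mirror CU_MEM_LOCATION_TYPE_* values; verify before relying on them.
    INVALID = 0
    DEVICE = 1
    HOST = 2
    HOST_NUMA = 3


def decode_v2_location(loc_type: int, loc_id: int) -> Device | Host | None:
    """Combine the _TYPE and _ID attribute values into a location object."""
    if loc_type == LocationType.DEVICE:
        return Device(loc_id)
    if loc_type == LocationType.HOST:
        return Host()
    if loc_type == LocationType.HOST_NUMA:
        return Host(numa_id=loc_id)
    # INVALID or an unrecognised type: no usable last-prefetch location.
    return None
```

The point of the separate type field is that a host NUMA node and a device ordinal can share the same integer id, which the legacy single-int attribute cannot express.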

## Follow-on cleanup

Once this lands, in `cuda_core/tests/memory/test_managed_ops.py`:
- Drop the private `_last_prefetch_location(buf)` helper.
- Replace `last == _HOST_LOCATION_ID` / `last == device.device_id` assertions with `buf.last_prefetch_location == Host()` / `... == device`.
- Drop the `from cuda.core._memory._managed_buffer import _get_int_attr` import and most `driver.CUmem_range_attribute.*` references.

## Scope notes

- Out of scope: the rest of `CUmem_range_attribute` (e.g., `PAGE_ENGINE_LAST_GPU_USED` and friends) — file separately if needed.
- This is the minimum needed to close PR #1775's abstraction-leak comment; everything else can come in follow-ups.
