
Conversation

kaiyuan-li (Contributor):

  1. Added an optional tensor_slice_spec to the get() API so we can get a slice instead of the full tensor (usage sketch below).
  2. Tests added for cross-process get() with two actors, and for in-place get().
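A minimal usage sketch of the new parameter (to be run inside an async context; the TensorSlice import path and constructor arguments here are assumptions for illustration, not the confirmed API):

```python
import torch
import torchstore as ts
from torchstore import TensorSlice  # import path is an assumption

await ts.put("weights", torch.randn(8, 8))

# Fetch the full tensor, as before.
full = await ts.get("weights")

# Fetch only a 4x4 corner. The offsets/lengths arguments are
# illustrative; the real TensorSlice fields may differ.
spec = TensorSlice(offsets=(0, 0), lengths=(4, 4))
part = await ts.get("weights", tensor_slice_spec=spec)
```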

Next:

  1. fix dtensor-related test (test_sharding.py)
  2. add test for dtensor get (with slice)
  3. optimize dtensor slice get:
    a. avoid constructing the whole global tensor
    b. avoid querying volumes that don't contain the tensor slice of interest
    c. batch tensor slice queries per volume

@kaiyuan-li kaiyuan-li requested a review from LucasLLC September 10, 2025 15:11
@meta-cla meta-cla bot added the CLA Signed label Sep 10, 2025
# but for now any should work.
fetched_tensor = await pipe.get_from_storage_volume(key, request)

# If user requested a specific slice, extract it
Contributor:

curious if it's possible to throw an exception in the storage layer? Does the request already have this information?

Contributor Author:

I thought for now we'd assume the request doesn't know such info and blindly fetches the whole tensor; the slice is then extracted on the client side.
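A rough sketch of what that client-side extraction could look like (the extract_slice helper and the offsets/lengths fields are hypothetical, for illustration only):

```python
import torch

# Hypothetical helper: narrow the fetched full tensor down to the
# requested slice, one dimension at a time.
def extract_slice(full_tensor: torch.Tensor, offsets, lengths) -> torch.Tensor:
    out = full_tensor
    for dim, (start, length) in enumerate(zip(offsets, lengths)):
        out = out.narrow(dim, start, length)
    return out
```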


# Handle in-place operation for tensor slice
if inplace_tensor is not None:
    inplace_tensor.copy_(sliced_tensor)
Contributor:

nit: I forgot to add this in a previous PR, but it's slightly cleaner to return inplace_tensor.copy_(sliced_tensor)

Contributor Author:

Good to learn (I checked the PyTorch docs: copy_() returns self). Done.
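A quick demonstration of that behavior:

```python
import torch

dst = torch.zeros(2, 3)
src = torch.ones(2, 3)

# Tensor.copy_() mutates dst in place and returns dst itself,
# so the call can sit directly in a return statement.
result = dst.copy_(src)
assert result is dst
```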

self,
key: str,
inplace_tensor: Optional[torch.Tensor] = None,
tensor_slice_spec: Optional[TensorSlice] = None,
Contributor:

As written, we should assert that either inplace_tensor or tensor_slice_spec is None; both being set should raise.

Contributor:

Ah nvm, in this case we should assert inplace_tensor.shape == tensor_slice_spec.shape

Contributor Author:

Validation added to raise ValueError if both are specified and there's a mismatch between the shapes.
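Roughly, per the diff context quoted later in this thread (the error message wording here is illustrative):

```python
if tensor_slice_spec is not None and inplace_tensor is not None:
    if tensor_slice_spec.local_shape != inplace_tensor.shape:
        raise ValueError(
            f"inplace_tensor shape {tuple(inplace_tensor.shape)} does not match "
            f"tensor_slice_spec.local_shape {tensor_slice_spec.local_shape}"
        )
```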

logger.debug(f"Fetching {key}")
request = Request.from_any(inplace_tensor)

# When slicing, don't use inplace_tensor for the request because the transport
Contributor:

I'm not sure I understand this comment -- does it simplify here if we assume one is None?

Contributor Author:

I guess the change here is because this Request always fetches the whole tensor. Before, there was no slice option, so inplace_tensor was either None or the whole tensor. Now there is a slice option, but at least for now we always want the request to cover the whole tensor.

Another question I have: do we want the pipe to know that we are just fetching a slice? At the moment I'm assuming the pipe always fetches the whole tensor, which is why the slice buffer is not put into the Request. This is simple and clean but NOT efficient. Maybe we can improve efficiency later by making Request slice-aware. What do you think?

Contributor:

I think the Request actually already has the ability to fetch a single tensor slice... this is how the tensor_slice_request works below.



@pytest.mark.asyncio
async def test_tensor_slice_inplace():
Contributor:

can we add a test such that we call put on a dtensor, and then call get with no tensor slice and no dtensor? The result should be the entire tensor

Contributor Author:

Test added. Also factored DTensorActor out of test_sharding.py into utils.py for reuse in test_store.py.

@kaiyuan-li kaiyuan-li left a comment (Contributor Author)

@LucasLLC thanks for the review. Those are good comments that helped me get a better understanding of the architecture and workflow. Please take another look :)


@LucasLLC LucasLLC left a comment (Contributor)

From my understanding, the main difference between a dtensor and a tensor slice is that a tensor slice does not have a coordinate. IIUC this was a requirement at some point during put, but it should not be required during gets.

I'm curious whether, if we created the request from the tensor slice, the implementation would be simplified, since we could use the same code paths as we do for dtensor.

if slice_spec is None:
    return await ts.get(key)
else:
    return await ts.get(key, tensor_slice_spec=slice_spec)
Contributor:

In the future we'll want to raise here as well if the tensor slice does not exist or is invalid.
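A hypothetical sketch of such a validity check; the offsets/lengths/global_shape names are assumptions, not the actual TensorSlice API:

```python
def validate_slice(offsets, lengths, global_shape) -> None:
    # Reject slices that fall outside the stored tensor's bounds.
    for dim, (start, length, size) in enumerate(zip(offsets, lengths, global_shape)):
        if start < 0 or length <= 0 or start + length > size:
            raise ValueError(
                f"invalid slice on dim {dim}: start={start}, length={length}, dim size={size}"
            )
```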

volume_world_size,
TensorSliceGetActor,
"tensor_slice_get_actors",
world_size=volume_world_size,
Contributor:

Why spawn volume_world_size actors if we're slicing to single node below?

Contributor Author:

Makes sense. This was mostly copied from an existing test example, lol. It doesn't look necessary. Updated to a direct ts.get().

# Initialize TorchStore with 2 storage volumes and LocalRankStrategy
from torchstore.strategy import LocalRankStrategy

await ts.initialize(num_storage_volumes=2, strategy=LocalRankStrategy())
Contributor:

nit: ts.LocalRankStrategy instead of import

Contributor Author:

Updated.

DTensorActor,
"dtensor_get_mesh",
mesh_shape=(2,),
original_tensor=torch.zeros(4, 4).float(),
Contributor:

nit: -> torch.zeros_like(original_tensor)

Contributor Author:

Updated.



@pytest.mark.asyncio
async def test_dtensor_simple_put_get():
Contributor:

I'm confused by this test. Why is this needed outside of test_resharding? As a side note it also does not test correctness (would also suggest changing the placement dim).

Contributor Author:

I misunderstood your comment in the last iteration. Updated this test to put a dtensor and just fetch it with get('key'). Also verified that the result matches the original tensor.
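A hedged sketch of that round-trip check; put_mesh, do_put, and "test_key" follow the test snippets quoted in this conversation, and original_tensor stands for the tensor the actors shard and put:

```python
import pytest
import torch
import torchstore as ts

@pytest.mark.asyncio
async def test_dtensor_simple_put_get():
    await put_mesh.do_put.call()               # actors put their DTensor shards
    fetched_tensor = await ts.get("test_key")  # plain get: no slice, no dtensor
    assert torch.equal(fetched_tensor, original_tensor)
```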

request = Request.from_any(inplace_tensor)

if tensor_slice_spec is not None and inplace_tensor is not None:
    if tensor_slice_spec.local_shape != inplace_tensor.shape:
Contributor:

if inplace_tensor is a dtensor we should assert on offset as well?

Contributor Author:

Added validation that tensor_slice_spec must be None if inplace_tensor is a DTensor.
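A paraphrased sketch of that guard (the DTensor import path may vary by PyTorch version, and the message wording is illustrative):

```python
from torch.distributed.tensor import DTensor

if isinstance(inplace_tensor, DTensor) and tensor_slice_spec is not None:
    raise ValueError(
        "tensor_slice_spec must be None when inplace_tensor is a DTensor"
    )
```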


# multinode support here
volume_map = await self._controller.locate_volumes.call_one(key)

if object_type in (ObjectType.OBJECT, ObjectType.TENSOR):
Contributor:

I think this if statement should only apply to ObjectType.OBJECT. Objects are the only items that are allowed to live on a single storage volume (so far).

In the past, the same was true about requesting tensors, since we assumed only a "DTensor" would request a tensor from sharded storage. Since we are allowing users to request an arbitrary 'tensor slice' without a dtensor, this is no longer the case, meaning we should change this code to only account for ObjectType.OBJECT, and likely use the codepath below.

Contributor Author:

Done with a major update. Please take a look.

@kaiyuan-li (Contributor Author):

Please take another look :)

async def get(
    key: str,
    inplace_tensor: Optional[torch.Tensor] = None,
    tensor_slice_spec: Optional[TensorSlice] = None,
Contributor:

please add docstring explaining the combinations which are allowed. e.g. "tensor_slice_spec + dtensor is not allowed"
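A hedged sketch of what that docstring could cover, collecting the combinations established elsewhere in this review:

```python
from typing import Optional

import torch

async def get(
    key: str,
    inplace_tensor: Optional[torch.Tensor] = None,
    tensor_slice_spec: Optional["TensorSlice"] = None,
):
    """Fetch a tensor (or a slice of it) from the store.

    Allowed combinations:
      - neither optional argument: return the full stored tensor.
      - inplace_tensor only: copy the full tensor into inplace_tensor and return it.
      - tensor_slice_spec only: return just the requested slice.
      - both: copy the slice into inplace_tensor; requires
        tensor_slice_spec.local_shape == inplace_tensor.shape, else ValueError.
      - inplace_tensor being a DTensor together with tensor_slice_spec:
        not allowed, raises ValueError.
    """
    ...
```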

full_tensor = await self._get_distributed_whole_tensor(key)

if isinstance(inplace_tensor, DTensor):
    request = Request.from_any(inplace_tensor)
Contributor:

nit: from DTensor

try:
    # Store a test tensor
    test_tensor = torch.randn(100, 200)
    await ts.put("inplace_test", test_tensor)
Contributor:

This is a cool thing to add to readme / docs
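For the docs, a README-style sketch of the in-place pattern this test exercises (assuming the ts.put / ts.get API shown in this PR):

```python
import torch
import torchstore as ts

# Store a tensor under a key.
await ts.put("inplace_test", torch.randn(100, 200))

# Fetch it back into a preallocated buffer instead of allocating a new tensor.
buffer = torch.empty(100, 200)
await ts.get("inplace_test", inplace_tensor=buffer)
```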


await put_mesh.do_put.call()

fetched_tensor = await ts.get("test_key")
Contributor:

yay!

object_type = ObjectType.from_request(request)

# multinode support here
stored_object_type = await self._get_stored_object_type(key)
Contributor:

One additional control-path query, which is probably fine, but we do something similar in Pipe (get_meta).

@LucasLLC LucasLLC left a comment (Contributor)

Overall really happy with the functionality of this PR! Tysm for going through the motions.

Since this is not pressing, I would like to spend a little more time thinking through how we can best encapsulate the responsibilities of each object, for example potentially moving some of this logic out of the client into the pipe.

If this ends up being blocking we can revisit, but since we have time I'd rather see if we can make the right decisions upfront! Let's schedule some time this week to go over it.

@LucasLLC LucasLLC left a comment (Contributor)

approving since we need this fix

@LucasLLC LucasLLC merged commit 0f0e7d4 into main Sep 19, 2025
1 of 5 checks passed
@LucasLLC LucasLLC deleted the lky_tensor_slice branch September 19, 2025 15:35