【Hackathon 5th No.28】为 Paddle 新增 slice_scatter API #668

VOIDMalkuth · 2023-09-28T04:07:15Z

为 Paddle 新增 slice_scatter API

paddle-bot · 2023-09-28T04:07:19Z

你的PR提交成功，感谢你对开源项目的贡献!
请检查PR提交格式和内容是否完备，具体请参考示例和模版。
Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

CLAassistant · 2023-09-28T04:07:20Z

All committers have signed the CLA.

VOIDMalkuth · 2023-10-03T05:15:48Z

文档中要求slice_scatter API的实现放在python/paddle/tensor/manipulation.py中，因此该算子可以在Python层用代码实现，且Python API的开发更便捷，本RFC将实现方式定为Python API组合。但是python/paddle/tensor/manipulation.py中的其他API均调用C++代码的算子库，为了一致性可能使用C++实现比较合适。想请问老师对于这两种实现手段有何看法，关于C++和Python API到底选用那种方式更合适。

zoooo0820 · 2023-10-11T06:41:57Z

rfcs/APIs/20230928_api_design_for_slice_scatter.md

+- dim: (int) 将要插入src slice的维度。
+- start: (Optional[int]) 待插入slice位置的起始index。
+- end: (Optional[int]) 待插入slice位置的结束index。
+- step: (int) 待插入slice的步长。


辛苦参考下命名规范https://www.paddlepaddle.org.cn/documentation/docs/zh/dev_guides/api_contributing_guides/api_design_guidelines_standard_cn.html 调整下API参数命名

已经修改

zoooo0820 · 2023-10-11T06:43:39Z

rfcs/APIs/20230928_api_design_for_slice_scatter.md

+## API实现方案
+
+- 使用`paddle.Tensor.clone`拷贝一个Tensor
+- 使用PythonAPI组合，以索引或切片方式在新Tensor上赋值以完成功能。


不建议直接使用索引切片的方式实现，首先有额外的不必要解析耗时，其次索引赋值是原地的（因此这里才提到需要clone)，建议参考set_value OP

这里的方案本质上是：

a = src.clone() a[...,start:end:step,...] = input

事实上Torch的实现也比较类似，但是Torch中是使用Aten在C++中实现的。

set_value我注意到文档中写道This API is ONLY available in Dygraph mode，似乎和我们的要求不符？想麻烦老师说明如何参考。

a = src.clone() a[...,start:end:step,...] = input

这个写法底层的本质仍然是调用的set_value OP，这个目前没暴露成api，可以参考下 OP定义https://github.com/PaddlePaddle/Paddle/blob/864779c75bc9d57a67442935c49e7903943edbc0/paddle/phi/kernels/cpu/set_value_kernel.cc#L15

以及调用方法 https://github.com/PaddlePaddle/Paddle/blob/864779c75bc9d57a67442935c49e7903943edbc0/python/paddle/base/variable_index.py#L555

收到，我去看一看

我看了一下，set_value OP其实完成的就是将Tensor的某些位置赋值给另一个Tensor。示例代码这里是inplace完成的。

老师的意思是，我们实现时仿照_setitem_impl_函数，通过set_value的C++ OP调用完成，同时取消inplace设置么？这样做应该就可以省去老师所说的解析开销？

set_value的C++ OP调用完成，同时取消inplace设置么？这样做应该就可以省去老师所说的解析开销？

是的，解析开销主要指从x[:] = 1的调用形式，逐步确认要调用哪个OP，以及准备这个OP的输入。这里由于我们已经知道应该调用哪个OP了，直接调用即可。此外，静态图还是需要append_op的方式，这个请参考上述代码或其他API的写法

zoooo0820 · 2023-10-20T02:57:28Z

rfcs/APIs/20230928_api_design_for_slice_scatter.md

 - start: (Optional[int]) 待插入slice位置的起始index。
- end: (Optional[int]) 待插入slice位置的结束index。
+- stop: (Optional[int]) 待插入slice位置的结束index。


因为和slice是对偶API，参数名建议和slice对齐；
此外x, y表示两个对等的参数。例如add(x, y)，这里使用value更合适

此处还需要修改下

此外，辛苦补充下pytorch对应API是否要求value的shape与input取出的情况一致呢？

好的，我再调研补充一下，抱歉之前可能没看到这条review

关于pytorch对应API是否要求value的shape与input取出的情况一致的问题，其算子中TORCH_CHECK(slice.sizes() == src.sizes(), "expected src to have a size equal to the slice of self. src size = ", src.sizes(), ", slice size = ", slice.sizes());这行代码会检测value的shape和input的切片的一致性，因此是需要一致的

因为和slice是对偶API，参数名建议和slice对齐；此外x, y表示两个对等的参数。例如add(x, y)，这里使用value更合适

是指将参数名变为x, value, axis, start, end, step么？此处我选择用了单数，因为slice的输入可以有多个，而这里只有一个输入。

是指将参数名变为x, value, axis, start, end, step么？此处我选择用了单数，因为slice的输入可以有多个，而这里只有一个输入。

这个地方，建议除了将y改成value以外。由于set_value算子本身和slice一样，支持多个轴，多个start等输入，个人觉得从对偶的角度，和slice一样支持多输入更好。此时也能覆盖目前的单输入场景

是指将参数名变为x, value, axis, start, end, step么？此处我选择用了单数，因为slice的输入可以有多个，而这里只有一个输入。

这个地方，建议除了将y改成value以外。由于set_value算子本身和slice一样，支持多个轴，多个start等输入，个人觉得从对偶的角度，和slice一样支持多输入更好。此时也能覆盖目前的单输入场景

如果要将后面的输入作为多个的话，是指将value的值放入多次切片后的位置（类似二维张量中的二维区域）么？这里我主要出于与torch对齐选择了当前的API格式，如果修改的话需要考虑对单输入特殊处理么。

是指将参数名变为x, value, axis, start, end, step么？此处我选择用了单数，因为slice的输入可以有多个，而这里只有一个输入。

这个地方，建议除了将y改成value以外。由于set_value算子本身和slice一样，支持多个轴，多个start等输入，个人觉得从对偶的角度，和slice一样支持多输入更好。此时也能覆盖目前的单输入场景

如果要将后面的输入作为多个的话，是指将value的值放入多次切片后的位置（类似二维张量中的二维区域）么？

是这样的语义。

目前看起来，可以将axes, starts等和paddle.slice一样以list类型作为输入，单输入只是List里只有一个元素而多输入是有多个元素即可。

我看了一下，我目前觉得可能还是单输入更好。一是多输入情况下我们其实仍要完成和set item有一部分重复的解析和判断操作（输入list内各参数的解析与判断，axes的合法性判断，多重输入判断等等），实现起来过于复杂。二是诸如pytorch和mindspore等其他框架都使用这一格式，我觉得api如果能够尽量与其他框架一致会更好。

zoooo0820 · 2023-10-20T03:01:17Z

rfcs/APIs/20230928_api_design_for_slice_scatter.md

+## API实现方案
+
+- 使用`paddle.Tensor.clone`拷贝一个Tensor
+- 使用PythonAPI组合，以索引或切片方式在新Tensor上赋值以完成功能。


set_value的C++ OP调用完成，同时取消inplace设置么？这样做应该就可以省去老师所说的解析开销？

是的，解析开销主要指从x[:] = 1的调用形式，逐步确认要调用哪个OP，以及准备这个OP的输入。这里由于我们已经知道应该调用哪个OP了，直接调用即可。此外，静态图还是需要append_op的方式，这个请参考上述代码或其他API的写法

luotao1 · 2023-11-07T04:13:43Z

@VOIDMalkuth 请问审核人的意见都改完了么，麻烦评论一下

VOIDMalkuth · 2023-11-07T04:15:26Z

@VOIDMalkuth 请问审核人的意见都改完了么，麻烦评论一下

目前还剩API的格式待讨论确定，完成后我们会at老师

VOIDMalkuth · 2023-12-14T15:33:16Z

由于个人事务较多，无法继续开发工作，希望#784 可以接替有关工作

add slice_scatter api rfc

7513086

paddle-bot bot added the contributor label Sep 28, 2023

Ligoml mentioned this pull request Sep 28, 2023

【PaddlePaddle Hackathon 5th】开源贡献个人挑战赛 PaddlePaddle/Paddle#57262

Open

luotao1 assigned luotao1 and zoooo0820 Oct 9, 2023

zoooo0820 reviewed Oct 11, 2023

View reviewed changes

update parameter name in slice_scatter rfc

9a5952f

zoooo0820 reviewed Oct 20, 2023

View reviewed changes

update implementation details part

0ec74fb

megemini mentioned this pull request Dec 13, 2023

【Hackathon 5th No.28】为 Paddle 新增 slice_scatter API #784

Merged

VOIDMalkuth closed this Dec 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

【Hackathon 5th No.28】为 Paddle 新增 slice_scatter API #668

【Hackathon 5th No.28】为 Paddle 新增 slice_scatter API #668

VOIDMalkuth commented Sep 28, 2023

paddle-bot bot commented Sep 28, 2023

CLAassistant commented Sep 28, 2023 •

edited

VOIDMalkuth commented Oct 3, 2023

zoooo0820 Oct 11, 2023

VOIDMalkuth Oct 13, 2023

VOIDMalkuth Oct 17, 2023

zoooo0820 Oct 11, 2023

VOIDMalkuth Oct 13, 2023

zoooo0820 Oct 16, 2023

VOIDMalkuth Oct 16, 2023

VOIDMalkuth Oct 17, 2023

zoooo0820 Oct 20, 2023

zoooo0820 Oct 20, 2023

zoooo0820 Nov 3, 2023

VOIDMalkuth Nov 3, 2023

VOIDMalkuth Nov 3, 2023

VOIDMalkuth Nov 3, 2023 •

edited

zoooo0820 Nov 6, 2023

VOIDMalkuth Nov 6, 2023

zoooo0820 Nov 6, 2023

VOIDMalkuth Nov 7, 2023

zoooo0820 Oct 20, 2023

luotao1 commented Nov 7, 2023

VOIDMalkuth commented Nov 7, 2023

VOIDMalkuth commented Dec 14, 2023

【Hackathon 5th No.28】为 Paddle 新增 slice_scatter API #668

【Hackathon 5th No.28】为 Paddle 新增 slice_scatter API #668

Conversation

VOIDMalkuth commented Sep 28, 2023

paddle-bot bot commented Sep 28, 2023

CLAassistant commented Sep 28, 2023 • edited

VOIDMalkuth commented Oct 3, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

VOIDMalkuth Nov 3, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

luotao1 commented Nov 7, 2023

VOIDMalkuth commented Nov 7, 2023

VOIDMalkuth commented Dec 14, 2023

CLAassistant commented Sep 28, 2023 •

edited

VOIDMalkuth Nov 3, 2023 •

edited