fix cumsum op for API 2.0, optimize performance test=develop #25505
Conversation
Thanks for your contribution!
python/paddle/tensor/math.py
Outdated
""" | ||
:alias_main: paddle.cumsum | ||
:alias: paddle.cumsum,paddle.tensor.cumsum,paddle.tensor.math.cumsum | ||
:old_api: paddle.fluid.layers.cumsum |
Delete this line.
Deleted.
LGTM
@@ -22,7 +22,14 @@ class CumOp : public framework::OperatorWithKernel {
  using framework::OperatorWithKernel::OperatorWithKernel;

  void InferShape(framework::InferShapeContext *ctx) const override {
    ctx->SetOutputDim("Out", ctx->GetInputDim("X"));
    if (ctx->Attrs().Get<bool>("flatten")) {
A newly added attribute is used here. Could this break backward compatibility?
Consider the scenario where a model trained with 1.8 is saved as an inference_model and then used for prediction with 2.0.
Is the scenario in question only whether a static-graph model saved with 1.8 can be used for prediction under the 2.0 static graph?
Verified by self-testing; it does not break compatibility.
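For context on what the new attribute does, the flatten semantics can be sketched in NumPy (an illustrative equivalence of the intended behavior, not the Paddle implementation): when flatten is true, which corresponds to axis=None in the Python API, the input is flattened before accumulation, so the output is 1-D rather than shaped like the input.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])

# flatten=False: accumulate along a given axis, output keeps the input shape
out_axis0 = np.cumsum(x, axis=0)    # shape (2, 3)

# flatten=True (axis=None): the input is flattened first, output is 1-D
out_flat = np.cumsum(x, axis=None)  # shape (6,)
```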
@@ -37,6 +44,10 @@ class CumsumOpMaker : public framework::OpProtoAndCheckerMaker {
             "dimension [default -1].")
        .SetDefault(-1)
        .EqualGreaterThan(-1);
    AddAttr<bool>("flatten",
This adds a new attribute. Can compatibility be maintained?
Consider the scenario where a model trained with 1.8 is saved as an inference_model and then used for prediction with 2.0.
Verified by self-testing; it does not break compatibility.
paddle/fluid/operators/cumsum_op.cu
Outdated
// size of the 'axis' dimension. Invalid in the reverse case because the Thrust
// APIs do not support it.
if (size == out_dims[axis] && !reverse) {
  if (exclusive) {
Is it worth adding a CUDA kernel for the reverse case to improve speed?
A CUDA kernel for the reverse case has been added; it yields up to a 1691x speedup.
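As a reference for the semantics involved here, the inclusive, exclusive, and reverse variants of the scan can be sketched in NumPy (an illustrative equivalence under the usual definitions, not the actual CUDA kernel):

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# inclusive scan: each position includes its own element
inclusive = np.cumsum(x)                              # [1, 3, 6, 10]

# reverse scan: accumulate from the last element toward the first
reverse = np.cumsum(x[::-1])[::-1]                    # [10, 9, 7, 4]

# exclusive scan: shift right so each position excludes its own element
exclusive = np.concatenate(([0], np.cumsum(x)[:-1]))  # [0, 1, 3, 6]
```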
python/paddle/tensor/math.py
Outdated
The cumulative sum of the elements along a given axis. The first element of the result is the same as the first element of the input.

Args:
    x (Variable): Input of cumsum operator, the Tensor/LoDTensor needed to be cumsumed.
Variable->Tensor
done
python/paddle/tensor/math.py
Outdated
    name (str, optional): Normally there is no need for the user to set this property. For more information, please refer to :ref:`api_guide_Name`. The default value is None.

Returns:
    Variable(Tensor/LoDTensor): The result of cumsum operator, output of cumsum operator.
Variable->Tensor
Tensor, the result of cumsum operator, output of cumsum operator.
done
Yes, that works. You need to run both the backward=True and backward=False cases; the backward cost then has to be derived by subtracting the forward-only time from the time of the backward run.

import paddle
import numpy as np
import paddle.fluid as fluid
import paddle.fluid.profiler as profiler

def test_speed(num_epochs=5, axis=0, backward=False):
    num_class = 3
    probas_data = np.random.random((2, 3, 1000, 1000)).astype(np.float32)
    labels_data = np.random.randint(0, num_class, [1, 1], dtype='int32')
    probas_shape = [3, 1000, 1000]  # per-sample shape; layers.data prepends the batch dim
    labels_shape = list(labels_data.shape)
    probas = fluid.layers.data(name='p', shape=probas_shape, dtype='float32')
    labels = fluid.layers.data(name='l', shape=labels_shape, dtype='int32', append_batch_size=False)
    param_attr = fluid.ParamAttr(name='conv2d.weight', initializer=fluid.initializer.ConstantInitializer(value=2.0))
    y_predict = fluid.layers.conv2d(input=probas, num_filters=num_class, filter_size=2, param_attr=param_attr)
    y_predict = paddle.reshape(y_predict, shape=[-1, 1])
    y_predict = paddle.cumsum(y_predict, axis=axis)
    labels = fluid.layers.cast(labels, dtype='float32')
    cost = fluid.layers.square_error_cost(input=y_predict, label=labels)
    loss = fluid.layers.mean(cost)
    if backward:
        sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.1)
        sgd_optimizer.minimize(loss)
    use_cuda = True
    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
    main_program = fluid.default_main_program()
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())
    # warm-up runs, excluded from profiling
    for i in range(5):
        exe.run(main_program,
                fetch_list=[],
                feed={'p': probas_data, 'l': labels_data},
                return_numpy=True)
    with profiler.profiler('GPU', 'total', '/tmp/profile') as prof:
        for i in range(num_epochs):
            exe.run(main_program,
                    fetch_list=[],
                    feed={'p': probas_data, 'l': labels_data},
                    return_numpy=True)

# test_speed(num_epochs=300)
test_speed(num_epochs=300, backward=True)
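The subtraction described above can be sketched as follows (the timing values here are hypothetical placeholders; the real numbers come from the profiler output of the two runs):

```python
# Hypothetical per-run totals, for illustration only
t_forward_only = 0.84    # total time of the backward=False run
t_with_backward = 2.10   # total time of the backward=True run

# Backward cost = backward-run time minus forward-only time
t_backward = t_with_backward - t_forward_only
```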
Force-pushed from 9068bc7 to 340c592
@@ -1543,3 +1542,73 @@ def kron(x, y, name=None):
    out = helper.create_variable_for_type_inference(dtype=x.dtype)
    helper.append_op(type="kron", inputs={"X": x, "Y": y}, outputs={"Out": out})
    return out


def cumsum(x, axis=None, dtype=None, name=None):
Add a @deprecated decorator to the fluid.layers.cumsum function to indicate its relationship to the new API.
The @deprecated decorator has been added in a new PR: #26104.
LGTM
def cumsum(x, axis=None, dtype=None, name=None):
    """
    :alias_main: paddle.cumsum
    :alias: paddle.cumsum,paddle.tensor.cumsum,paddle.tensor.math.cumsum
The aliases do not need to be written; these two lines can be deleted.
PR types
Function optimization, Performance optimization
PR changes
OPs
Describe
1. Optimize forward performance.
2. Optimize backward performance.
3. Add the parameters "dtype" and "name".
4. If axis=None, flatten the input.
5. Support full negative indexing for the 'axis' parameter.
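Assuming the described behavior, the new features (items 3–5 above) can be illustrated with a NumPy sketch of the intended semantics (not the Paddle implementation itself):

```python
import numpy as np

x = np.arange(6, dtype=np.int32).reshape(2, 3)

# item 4: axis=None flattens the input before accumulating
flat = np.cumsum(x)                   # 1-D result, shape (6,)

# item 5: a negative axis counts from the last dimension
last_axis = np.cumsum(x, axis=-1)     # same as axis=1 for a 2-D input

# item 3: dtype controls the accumulation/output type
as_float64 = np.cumsum(x, dtype=np.float64)
```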