Passing a float to paddle.randn causes an out-of-memory error #35669

294coder · 2021-09-12T08:10:10Z

paddle version: 2.1.2

import paddle
paddle.randn((2*16, 640*640/16, 128))

The second element of the shape is a float, which produces the following error:
SystemError: (Fatal) Operator gaussian_random raises an struct paddle::memory::allocation::BadAlloc exception.
The exception content is
:ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 400.000244MB memory on GPU 0, 2.361126GB memory has been allocated and available memory is only 3.638874GB.
Please check whether there is any other process using GPU 0.

If yes, please stop them, or start PaddlePaddle on another GPU.
If no, please decrease the batch size of your model.
(at C:\home\workspace\Paddle_release3\paddle\fluid\memory\allocation\cuda_allocator.cc:79)
. (at C:\home\workspace\Paddle_release3\paddle\fluid\imperative\tracer.cc:192)

Running on the CPU still does not raise a type error; it only produces a bad alloc.
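A minimal sketch of a workaround, not a fix confirmed by the maintainers: in Python 3 the `/` operator always returns a float, so `640*640/16` evaluates to `25600.0`. Floor division with `//` keeps every shape element an int. The helper name `tensor_nbytes` is hypothetical, and it assumes 4-byte (float32) elements; under that assumption the shape (32, 25600, 128) needs exactly 400 MiB, which is close to the 400.000244MB reported in the error above.

```python
import math

# `/` is true division in Python 3 and always yields a float.
float_shape = (2 * 16, 640 * 640 / 16, 128)
# `//` is floor division and keeps the result an int.
int_shape = (2 * 16, 640 * 640 // 16, 128)


def tensor_nbytes(shape, itemsize=4):
    """Hypothetical helper: bytes needed for a dense tensor of this shape.

    itemsize=4 assumes float32 elements.
    """
    return math.prod(shape) * itemsize


print([type(d).__name__ for d in float_shape])  # ['int', 'float', 'int']
print(int_shape)                                # (32, 25600, 128)
print(tensor_nbytes(int_shape) / 2**20, "MiB")  # 400.0 MiB
```

With the integer shape, the call becomes `paddle.randn(int_shape)`.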


paddle-bot-old · 2021-09-12T08:10:12Z

Hi! We've received your issue and will arrange for technical staff to respond as soon as possible; please be patient. Please double-check that you have provided a clear problem description, reproduction code, environment and version information, and the error message. You can also consult the official API documentation, the FAQ, past GitHub issues, and the AI community for answers. Have a nice day!

FrostML · 2021-09-13T03:45:30Z

You can try building the develop branch yourself; the error message has been updated there.

ValueError: (InvalidArgument) gaussian_random(): argument (position 1) must be list of int, but got float at pos 1

libertatis · 2021-09-13T03:57:36Z

import paddle 
tensor = paddle.randn((2*16, 640*640/16, 128))

The signature of randn is:

paddle.randn(shape, dtype=None, name=None)

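The requirement on the shape parameter is:

shape (list|tuple|Tensor) - The shape of the generated random Tensor. If shape is a list or tuple, its elements may be ints or Tensors of shape [1] with dtype int32 or int64. If shape is a Tensor, it must be a 1-D Tensor with dtype int32 or int64.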

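In other words, the elements of shape must be integers. However, when an element of shape is a float, no error is raised and the tensor is created successfully. This is because in dynamic graph mode, i.e. when: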

paddle.fluid.framework.in_dygraph_mode() == True

neither the shape check, check_shape(...), nor the dtype check, check_dtype(...), is performed. The reason given in the source comment is:

    # NOTE [ Why skip dynamic graph check ]:
    # 1. If the input type / dtype of a layer is wrong, it will be reported
    # directly on that line. User can easily print the relevant information
    # on which line. It is easier to debug, so there is no need to check
    # in dynamic graph mode.
    # 2. Performance considerations. Because these checks are executed at
    # each step in dynamic graph mode, it will bring a heavy performance burden.
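Because dynamic graph mode skips these checks, a caller who wants an early, readable error can validate the shape before passing it on. The sketch below is an illustration only: `require_int_shape` is a hypothetical helper, not part of Paddle, and it only handles list or tuple shapes whose elements are plain numbers.

```python
import operator


def require_int_shape(shape):
    """Hypothetical helper: return shape as a list of ints or raise TypeError.

    operator.index accepts int-like objects and rejects floats,
    even integral ones such as 25600.0.
    """
    checked = []
    for pos, dim in enumerate(shape):
        try:
            checked.append(operator.index(dim))
        except TypeError:
            raise TypeError(
                f"shape[{pos}] must be an int, got {type(dim).__name__}: {dim!r}"
            ) from None
    return checked
```

Calling `require_int_shape((2*16, 640*640/16, 128))` raises a TypeError that names position 1, while an all-int shape is returned unchanged as a list.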

In dynamic graph mode, randn simply converts shape to a list and then calls gaussian_random under _C_ops:

    if in_dygraph_mode():
        shape = utils.convert_shape_to_list(shape)
        return _C_ops.gaussian_random('shape', shape, 'mean',
                                      float(mean), 'std',
                                      float(std), 'seed', seed, 'dtype', dtype)

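When convert_shape_to_list is called with a list or tuple, shape is merely normalized to a list and no data type conversion takes place. If shape is a Tensor, on the other hand, its data type is explicitly converted to int. convert_shape_to_list is defined as follows: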

def convert_shape_to_list(shape):
    """
    Convert shape(list, tuple, variable) to list in imperative mode
    """
    if isinstance(shape, (list, tuple)):
        shape = list(
            map(lambda x: x.numpy()[0] if isinstance(x, Variable) else x,
                shape))
    else:
        shape = shape.numpy().astype(int).tolist()
    return shape
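To see the pass-through behaviour without installing Paddle, the list/tuple branch can be imitated in plain Python. This is a simplified stand-in, not Paddle's code: `FakeVariable` is a hypothetical placeholder for `Variable`, and its `numpy()` returns a plain list rather than an array.

```python
class FakeVariable:
    """Hypothetical stand-in for a shape-[1] integer Tensor."""

    def __init__(self, value):
        self._value = value

    def numpy(self):
        # Mimics a one-element array; a plain list is enough for indexing.
        return [self._value]


def convert_shape_to_list_sketch(shape):
    """Imitates only the list/tuple branch shown above."""
    return [x.numpy()[0] if isinstance(x, FakeVariable) else x for x in shape]


result = convert_shape_to_list_sketch((2 * 16, 640 * 640 / 16, FakeVariable(128)))
print(result)                              # [32, 25600.0, 128]
print([type(d).__name__ for d in result])  # ['int', 'float', 'int']
```

The float in position 1 survives the conversion untouched, which matches the [int, float, int] observation in this comment.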

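That is, when randn calls gaussian_random, shape has only been converted to a list; the data types of its elements are unchanged, i.e. [int, float, int]. Printing the shape of the resulting tensor shows that the second element has been rounded down to an int. So in dynamic graph mode, the handling of shape happens inside gaussian_random.

Unfortunately, I could not find the definition or implementation of gaussian_random in the _C_ops module. It is probably a C++ function, and I am not sure how Python calls into C++. If possible, could the maintainers point out where gaussian_random is defined and implemented, and how _C_ops calls it?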

wanghuancoder · 2021-09-13T06:14:36Z

The code that Python invokes through _C_ops.xxx is auto-generated. The generator lives in the following file:
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/pybind/op_function_generator.cc

If you build develop yourself, once the build completes you will find the file paddle/fluid/pybind/op_function_impl.h; gaussian_random is defined in that file.

libertatis · 2021-09-13T07:06:22Z

Got it, thanks! I'll give it a try.

paddle-bot-old bot assigned FrostML Sep 12, 2021

FrostML mentioned this issue Apr 15, 2022

[PFCC] Debugging the C++ backend when calling the API from Python #41827

Closed

paddle-bot-old bot closed this as completed Sep 20, 2022

paddle-bot bot added the status/close 已关闭 label Sep 20, 2022
