paddle.randn: passing a float in shape causes an out-of-memory error #35669

Closed
294coder opened this issue Sep 12, 2021 · 5 comments

@294coder

paddle version: 2.1.2

import paddle
paddle.randn((2*16, 640*640/16, 128))

The second element of the shape is a float, which produces the following error:
SystemError: (Fatal) Operator gaussian_random raises an struct paddle::memory::allocation::BadAlloc exception.
The exception content is
:ResourceExhaustedError:
Out of memory error on GPU 0. Cannot allocate 400.000244MB memory on GPU 0, 2.361126GB memory has been allocated and available memory is only 3.638874GB.
Please check whether there is any other process using GPU 0.

  1. If yes, please stop them, or start PaddlePaddle on another GPU.
  2. If no, please decrease the batch size of your model.
    (at C:\home\workspace\Paddle_release3\paddle\fluid\memory\allocation\cuda_allocator.cc:79)
    . (at C:\home\workspace\Paddle_release3\paddle\fluid\imperative\tracer.cc:192)

Running on the CPU still does not raise a type error; it only produces a bad alloc.
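
For reference, a small sketch of why the second dimension ends up as a float and what the integer form looks like. It uses plain Python only. The 4-byte element size is an assumption (float32, Paddle's default dtype), and the byte count is simply arithmetic on the shape from the report.

```python
# `/` is true division in Python 3 and always yields a float.
shape_float = (2 * 16, 640 * 640 / 16, 128)
# `//` is floor division and keeps the result an int.
shape_int = (2 * 16, 640 * 640 // 16, 128)

print(shape_float)  # (32, 25600.0, 128)
print(shape_int)    # (32, 25600, 128)

# Assumption: float32 elements, 4 bytes each.
num_elements = shape_int[0] * shape_int[1] * shape_int[2]
size_mib = num_elements * 4 / (1024 * 1024)
print(num_elements, size_mib)  # 104857600 400.0
```

Under that assumption the tensor needs about 400 MiB, which roughly matches the allocation size quoted in the error message.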

@paddle-bot-old

Hi! We have received your issue and will arrange for technicians to answer it as soon as possible, so please be patient. Please double-check that you have provided a clear problem description, reproduction code, environment and version details, and the error message. You may also check the API documentation, the FAQ, past GitHub issues, and the AI community for an answer. Have a nice day!

@FrostML
Contributor

FrostML commented Sep 13, 2021

You can try building the develop branch yourself; the error message has been updated there.

ValueError: (InvalidArgument) gaussian_random(): argument (position 1) must be list of int, but got float at pos 1

@libertatis

import paddle 
tensor = paddle.randn((2*16, 640*640/16, 128))

The signature of randn is:

paddle.randn(shape, dtype=None, name=None)

The documented requirement for the shape parameter is:

shape (list|tuple|Tensor) - The shape of the generated random Tensor. If shape is a list or tuple, its elements may be ints or Tensors of shape [1] with dtype int32 or int64. If shape is a Tensor, it must be a 1-D Tensor with dtype int32 or int64.

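In other words, the elements of shape are required to be integers. However, when an element of shape is a float, no error is raised and the tensor is created successfully. This is because in dynamic graph mode, that is, when: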

paddle.fluid.framework.in_dygraph_mode() == True

no shape check (check_shape(...)) or dtype check (check_dtype(...)) is performed. The source code gives the following reasons:

    # NOTE [ Why skip dynamic graph check ]:
    # 1. If the input type / dtype of a layer is wrong, it will be reported
    # directly on that line. User can easily print the relevant information
    # on which line. It is easier to debug, so there is no need to check
    # in dynamic graph mode.
    # 2. Performance considerations. Because these checks are executed at
    # each step in dynamic graph mode, it will bring a heavy performance burden.

In dynamic graph mode, randn only converts shape to a list and then calls gaussian_random from _C_ops:

    if in_dygraph_mode():
        shape = utils.convert_shape_to_list(shape)
        return _C_ops.gaussian_random('shape', shape, 'mean',
                                      float(mean), 'std',
                                      float(std), 'seed', seed, 'dtype', dtype)

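When convert_shape_to_list is called with a list or tuple, shape is merely normalized to a list and the element types are not converted. If shape is a Tensor, on the other hand, its data is explicitly cast to int. convert_shape_to_list is defined as follows: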

def convert_shape_to_list(shape):
    """
    Convert shape(list, tuple, variable) to list in imperative mode
    """
    if isinstance(shape, (list, tuple)):
        shape = list(
            map(lambda x: x.numpy()[0] if isinstance(x, Variable) else x,
                shape))
    else:
        shape = shape.numpy().astype(int).tolist()
    return shape
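
The list/tuple branch above can be illustrated without Paddle installed. The following is a minimal sketch, not Paddle's code: FakeVariable is a hypothetical stand-in for a shape-[1] Tensor, and convert_shape_to_list_sketch mirrors only the list/tuple branch.

```python
class FakeVariable:
    """Hypothetical stand-in for a shape-[1] Tensor; not part of Paddle."""

    def __init__(self, value):
        self._value = value

    def numpy(self):
        # The real Tensor returns an ndarray; a one-element list is enough here.
        return [self._value]


def convert_shape_to_list_sketch(shape):
    # Mirrors only the list/tuple branch: unwrap tensor-like elements,
    # pass every other element through unchanged (no cast to int).
    return [x.numpy()[0] if isinstance(x, FakeVariable) else x for x in shape]


shape = convert_shape_to_list_sketch((2 * 16, 640 * 640 / 16, FakeVariable(128)))
print(shape)                              # [32, 25600.0, 128]
print([type(d).__name__ for d in shape])  # ['int', 'float', 'int']
```

The float survives the conversion untouched, which is the behavior described in this comment.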

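In other words, when randn calls gaussian_random here, shape has only been converted to a list and the element types are unchanged, i.e. [int, float, int]. Printing the shape of the resulting tensor shows that the second element has been rounded down to an int. So in dynamic graph mode, the handling of shape happens inside gaussian_random. Unfortunately, I could not find the definition or implementation of gaussian_random in the _C_ops module. It is presumably a C++ function, and I am not sure how Python calls into C++ here. If possible, could the maintainers point out where gaussian_random is defined and implemented, and how _C_ops invokes it?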
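
Since dynamic graph mode skips these checks in the version discussed, one workaround is to validate the shape on the caller's side before passing it on. The following is a minimal sketch; as_int_shape is a hypothetical helper and not part of Paddle.

```python
import numbers


def as_int_shape(shape):
    """Hypothetical helper: return shape as a list of ints, or raise TypeError."""
    result = []
    for pos, dim in enumerate(shape):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(dim, bool) or not isinstance(dim, numbers.Integral):
            raise TypeError(
                f"shape[{pos}] must be an int, got {type(dim).__name__}: {dim!r}"
            )
        result.append(int(dim))
    return result


print(as_int_shape((2 * 16, 640 * 640 // 16, 128)))  # [32, 25600, 128]

try:
    as_int_shape((2 * 16, 640 * 640 / 16, 128))
except TypeError as err:
    print(err)  # shape[1] must be an int, got float: 25600.0
```

The validated list could then be passed on, for example as paddle.randn(as_int_shape(shape)), so that a float dimension fails early with a message that names the offending position.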

@wanghuancoder
Contributor

The Python code that calls _C_ops.xxx is auto-generated. The generator lives in the following file:
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/pybind/op_function_generator.cc

If you build develop yourself, you will find the file paddle/fluid/pybind/op_function_impl.h once the build completes; gaussian_random is defined there.

@libertatis

Got it, thanks! I'll give it a try.
