Training/prediction with kie_unet_sdmgr raises OSError: (External) CUDA error(700), an illegal memory access was encountered. #6533

Closed
ChenNima opened this issue Jun 9, 2022 · 2 comments

ChenNima commented Jun 9, 2022

Please provide the following information to quickly locate the problem

  • System environment: Windows 10, GTX3070, CUDA version 11.2, cuDNN version 8.2.
  • Version: Paddle: paddlepaddle-gpu==2.3.0, cudatoolkit=11.2; PaddleOCR: release/2.5
  • Related components: kie_unet_sdmgr
  • Command: python tools/train.py -c configs/kie/kie_unet_sdmgr.yml -o Global.save_model_dir=./output/kie/
  • Complete error message:
Traceback (most recent call last):
  File "E:\code\PaddleOCR\tools\train.py", line 191, in <module>
    main(config, device, logger, vdl_writer)
  File "E:\code\PaddleOCR\tools\train.py", line 164, in main
    program.train(config, train_dataloader, valid_dataloader, device, model,
  File "E:\code\PaddleOCR\tools\program.py", line 264, in train
    preds = model(batch)
  File "C:\Users\felix\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "C:\Users\felix\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "E:\code\PaddleOCR\ppocr\modeling\architectures\base_model.py", line 85, in forward
    x = self.head(x, targets=data)
  File "C:\Users\felix\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "C:\Users\felix\anaconda3\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "E:\code\PaddleOCR\ppocr\modeling\heads\kie_sdmgr_head.py", line 88, in forward
    nodes = paddle.scatter(nodes, valid.squeeze(1), t)
  File "C:\Users\felix\anaconda3\lib\site-packages\paddle\tensor\manipulation.py", line 1587, in scatter
    return _C_ops.scatter(x, index, updates, 'overwrite', overwrite)
OSError: (External) CUDA error(700), an illegal memory access was encountered.
  [Hint: 'cudaErrorIllegalAddress'. The device encountered a load or store instruction on an invalid memory address. This leaves the process in an inconsistentstate and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched. ] (at ..\paddle\phi\backends\gpu\cuda\cuda_info.cc:251)
  [operator < scatter > error]

paddlepaddle-gpu, CUDA, and cuDNN are all installed correctly:

Running verify PaddlePaddle program ...
W0609 14:51:24.791992 12200 gpu_context.cc:278] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.2
W0609 14:51:24.803002 12200 gpu_context.cc:306] device: 0, cuDNN Version: 8.2.
PaddlePaddle works well on 1 GPU.
PaddlePaddle works well on 1 GPUs.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

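Other PaddleOCR modules train and predict normally on the GPU.
The problem occurs only when training/predicting on the GPU; on the CPU everything works. The dataset is the default wildreceipt.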
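
The traceback ends in `paddle.scatter(nodes, valid.squeeze(1), t)`. A common generic cause of an illegal memory access in a scatter kernel is an index that falls outside the rows of the destination tensor. Nothing in this thread confirms that this is the cause here (the CPU run succeeding points elsewhere), but it is cheap to rule out. Below is a minimal sketch of such a check in plain Python. The helper name `find_bad_scatter_indices` is hypothetical, and it assumes the index has first been converted to a Python list, for example with `valid.squeeze(1).numpy().tolist()`. Treating negative indices as out of range is also an assumption of this sketch.

```python
def find_bad_scatter_indices(index, num_rows):
    """Return (position, value) pairs for indices outside [0, num_rows).

    `index` is any iterable of integers; `num_rows` is the size of the
    first dimension of the tensor being scattered into.
    """
    return [
        (pos, int(i))
        for pos, i in enumerate(index)
        if not 0 <= int(i) < num_rows
    ]


# Illustrative data only: a destination with 4 rows, where 7 and -1
# are flagged as out of range.
bad = find_bad_scatter_indices([0, 3, 7, -1], num_rows=4)
print(bad)  # [(2, 7), (3, -1)]
```

An empty result means the indices are in range, which would make an environment or kernel problem the more likely explanation.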

Collaborator

LDOUBLEV commented Jun 9, 2022

OSError: (External) CUDA error(700), an illegal memory access was encountered.

Not enough GPU memory.

Author

ChenNima commented Jun 9, 2022

OSError: (External) CUDA error(700), an illegal memory access was encountered.

Not enough GPU memory.

Update: on the same hardware, with WSL2 enabled inside Windows and CUDA Toolkit 11.7 + cuDNN 8.4, it works successfully. It looks like some kind of environment issue.
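
Since the failing and working setups differ only in environment, it can help to record the versions each one reports and compare them side by side. The following is a minimal sketch, not something from the thread: the function name `environment_snapshot` is hypothetical, and it assumes that `paddle.version.cuda()` and `paddle.version.cudnn()` are available in the installed Paddle build. If Paddle cannot be imported, only the platform details are reported.

```python
import platform


def environment_snapshot():
    """Collect OS, Python, and (if installed) Paddle/CUDA/cuDNN versions."""
    info = {
        "platform": platform.platform(),
        "python": platform.python_version(),
    }
    try:
        import paddle
    except ImportError:
        # Paddle is not installed in this interpreter.
        info["paddle"] = None
        return info
    info["paddle"] = paddle.__version__
    # CUDA and cuDNN versions that Paddle was built against.
    info["cuda"] = paddle.version.cuda()
    info["cudnn"] = paddle.version.cudnn()
    return info


for key, value in environment_snapshot().items():
    print(f"{key}: {value}")
```

Running it once under native Windows and once under WSL2 gives two short listings that can be pasted into the issue for comparison.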
