Running the Paddle 0.10.0 v1 API on a cloud GPU machine #2946

WoNiuHu · 2017-07-18T12:01:29Z

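Hi, as the title says. I tried to run the v1 API on a cloud machine, but the machine does not support it and reports: paddle command not found.
I then asked in the Paddle chat group and was advised to install nvidia-docker and the paddle:gpu-release-v0.9.0 image. The CPU version runs, but running on GPU fails with the following error: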


I0718 02:20:33.320343 36 Util.cpp:155] commandline: /usr/local/bin/../opt/paddle/bin/paddle_trainer --config=trainer_config.py --save_dir=./model_output --job=train --use_gpu=true --trainer_count=1 --num_passes=2 --log_period=10 --dot_period=20 --show_parameter_stats_period=100 --test_all_data_in_one_period=1

I0718 02:22:57.979029 36 Util.cpp:130] Calling runInitFunctions

I0718 02:22:57.980270 36 Util.cpp:143] Call runInitFunctions done.

[INFO 2017-07-18 02:22:59,474 networks.py:1466] The input order is [word, label]
[INFO 2017-07-18 02:22:59,474 networks.py:1472] The output order is [cost_0]
I0718 02:22:59.617970 36 Trainer.cpp:170] trainer mode: Normal

F0718 02:22:59.624567 36 hl_gpu_matrix_kernel.cuh:181] Check failed: cudaSuccess == err (0 vs. 8) [hl_gpu_apply_unary_op failed] CUDA error: invalid device function
*** Check failure stack trace: ***

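Colleagues from IDL then suggested running in a Paddle 0.10.0 environment, since it supports the v1 API. Do v1 scripts have to be executed through the paddle binary? How should I install it so that the v1 API runs on a GPU machine?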


helinwang · 2017-07-18T21:13:51Z

@WoNiuHu Hello, and sorry: Paddle 0.10.0 supports the V2 API and is not backward compatible with the v1 API.
Here is what I tried, which confirmed this conclusion:

v1_api_demo git:(2885) docker run -it -v $PWD:/paddle -e "WITH_GPU=OFF" -e "WITH_AVX=OFF" paddlepaddle/paddle:0.10.0rc2 bash

root@411cb1beef42:/paddle/mnist# ./train.sh 
I0718 21:08:32.183030   298 Util.cpp:160] commandline: /usr/bin/../opt/paddle/bin/paddle_trainer --config=vgg_16_mnist.py --dot_period=10 --log_period=100 --test_all_data_in_one_period=1 --use_gpu=0 --trainer_count=1 --num_passes=100 --save_dir=./mnist_vgg_model 
F0718 21:08:32.298312   298 PythonUtil.cpp:186] Check failed: (module) != nullptr Current PYTHONPATH: ['/usr/opt/paddle/bin', '/paddle/mnist', '/usr/lib/python27.zip', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-linux2', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/lib/python2.7/dist-packages']
Python Error: <type 'exceptions.ImportError'> : No module named paddle.trainer.config_parser
Python Callstack: 
Import paddle.trainer.config_parserError
*** Check failure stack trace: ***
    @           0x8a1d3c  google::LogMessage::Fail()
    @           0x8a1c83  google::LogMessage::SendToLog()
    @           0x8a15f8  google::LogMessage::Flush()
    @           0x8a47ed  google::LogMessageFatal::~LogMessageFatal()
    @           0x7ff47b  paddle::py::import()
    @           0x7ff4ee  paddle::callPythonFuncRetPyObj()
    @           0x7ff8bc  paddle::callPythonFunc()
    @           0x729553  paddle::TrainerConfigHelper::TrainerConfigHelper()
    @           0x729b94  paddle::TrainerConfigHelper::createFromFlags()
    @           0x594932  main
    @     0x7f36f1818b45  __libc_start_main
    @           0x5a2149  (unknown)
    @              (nil)  (unknown)
/usr/bin/paddle: line 109:   298 Aborted                 ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

~~I will help you build a v0.9.0 CUDA 8 docker image. It will take some time; I am not very familiar with this part and may need to ask other developers.~~

helinwang · 2017-07-18T21:16:24Z

~~0.10.0 supports the V2 API and is not backward compatible with the v1 API. I am closing this issue for now; let's discuss your problem in #2931.~~

typhoonzero · 2017-07-19T06:09:14Z

@helinwang This is a bug in the paddlepaddle/paddle:0.10.0rc2 docker image. It has been fixed in rc3; alternatively, use paddlepaddle/paddle:0.10.0 directly, which is the release version.

helinwang · 2017-07-19T19:07:05Z

Understood. After testing, paddlepaddle/paddle:0.10.0 does indeed support the V1 API.

helinwang · 2017-07-19T19:09:54Z

@WoNiuHu In my test here, the paddle command is found:

➜  v1_api_demo git:(2885) ✗ docker run -it paddlepaddle/paddle:0.10.0 bash        
root@482cf1f3cb15:/# paddle
usage: paddle [--help] [<args>]
These are common paddle commands used in various situations:
    train             Start a paddle_trainer
    merge_model       Start a paddle_merge_model
    pserver           Start a paddle_pserver_main
    version           Print paddle version
    dump_config       Dump the trainer config as proto string
    make_diagram      Make Diagram using Graphviz

'paddle train --help' 'paddle merge_model --help', 'paddle pserver --help', list more detailed usage of each command
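To reproduce this check on another machine or inside another image, the following is a minimal sketch that reports whether a command is on `PATH`. The helper names `have_cmd` and `report_cmd` are hypothetical and not part of Paddle; the sketch relies only on the POSIX `command -v` builtin.

```shell
#!/bin/sh
# Return success if the given command name resolves on PATH.
# `command -v` is POSIX and exits non-zero when the name is not found.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print a one-line status for the given command name.
report_cmd() {
    if have_cmd "$1"; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found"
    fi
}

# On a host without Paddle installed this is expected to report "not found";
# inside the paddlepaddle/paddle:0.10.0 container it should report a path.
report_cmd paddle
```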

WoNiuHu · 2017-07-20T01:47:17Z

@helinwang Does the paddlepaddle/paddle:0.10.0 image support GPU?

WoNiuHu · 2017-07-20T02:02:04Z

@typhoonzero Hi, could you provide a GPU image that can run the v1 API?

Yancey1989 · 2017-07-20T02:13:57Z

@WoNiuHu You can use paddlepaddle/paddle:0.10.0-gpu. The PaddlePaddle images are listed at https://hub.docker.com/r/paddlepaddle/paddle/tags/.
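A minimal sketch of starting that image with GPU access, assuming nvidia-docker is installed on the host as mentioned earlier in this thread. The `/paddle` mount point mirrors the `docker run` example above; the `run` wrapper and the `DRY_RUN` switch are hypothetical helpers added so the command can be inspected before it is executed.

```shell
#!/bin/sh
# Image tag taken from this thread.
IMAGE="paddlepaddle/paddle:0.10.0-gpu"
# Hypothetical switch: 1 prints the command, 0 actually runs it.
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise execute it unchanged.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start an interactive shell in the GPU image with the current
# directory (for example the v1 demo directory) mounted at /paddle.
run nvidia-docker run -it -v "$PWD:/paddle" "$IMAGE" bash
```

Inside the container, the v1 job would then be started through the `paddle train` subcommand with the same flags as in the original report, for example `--config=trainer_config.py --use_gpu=true --trainer_count=1`.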

typhoonzero · 2017-07-20T02:14:25Z

The reply above is the right answer.

WoNiuHu · 2017-07-20T02:16:28Z

@Yancey1989 Are you sure this one can run the v1 API on GPU? The 0.9.0 image I downloaded earlier works on CPU but fails on GPU.

Yancey1989 · 2017-07-20T02:42:27Z

The cause of the GPU error in 0.9.0 has already been explained in #2931 (comment): the CUDA version of 0.9.0-gpu may be too old. Please try the 0.10.0-gpu image, which is built with CUDA 8.
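When comparing the two images, it can help to pull just the CUDA failure out of a long trainer log. The following is a minimal sketch, assuming the fatal line has the same shape as the one in the original report (`... CUDA error: <message>`); `cuda_errors` is a hypothetical helper name, not a Paddle tool.

```shell
#!/bin/sh
# Read log lines on stdin and print only the message that follows
# "CUDA error: " on lines that contain it; other lines are dropped.
cuda_errors() {
    sed -n 's/.*CUDA error: \(.*\)$/\1/p'
}
```

A possible use is `paddle train --config=trainer_config.py --use_gpu=true 2>&1 | cuda_errors`. An "invalid device function" message commonly indicates that the binary contains no GPU code compatible with the device, which would be consistent with the CUDA version explanation above.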

lcy-seso · 2017-08-19T12:00:43Z

I am closing this issue due to inactivity. Please feel free to reopen it if more information is available.

WoNiuHu changed the title ~~Running the v1 API on a cloud GPU machine~~ Running the 0.10.0 v1 API on a cloud GPU machine Jul 18, 2017

WoNiuHu changed the title ~~Running the 0.10.0 v1 API on a cloud GPU machine~~ Running the Paddle 0.10.0 v1 API on a cloud GPU machine Jul 18, 2017

helinwang mentioned this issue Jul 18, 2017

Error when running the GPU version on a cloud machine #2931

Closed

helinwang closed this as completed Jul 18, 2017

helinwang reopened this Jul 19, 2017

lcy-seso closed this as completed Aug 19, 2017

heavengate pushed a commit to heavengate/Paddle that referenced this issue Aug 16, 2021

fix url address, test=document_fix (PaddlePaddle#2946)

945bafa
