Paddle training with multiple CPUs is slower than with a single CPU #923

Closed
janelu9 opened this issue Dec 16, 2016 · 8 comments

janelu9 commented Dec 16, 2016

I set up images on a server with 1, 2, 4, and 8 CPUs. With trainer_count set to 1, 2, 4, and 8 respectively, training got progressively slower; with it set to 1 in every case, the speeds were about the same. That suggests Paddle is not actually making use of the extra CPUs.

janelu9 (Author) commented Dec 16, 2016

The version is 0.9.0a0, with the following build options:
with_avx: ON
with_gpu: OFF
with_double: OFF
with_python: ON
with_rdma: OFF
with_glog: ON
with_gflags: ON
with_metric_learning:
with_timer: OFF

reyoung (Collaborator) commented Dec 16, 2016

@janelu9 The most likely cause is that batch_size is set too small, which leaves the compute threads idle much of the time.

Also, the DataProvider that reads the data may have been written to be too slow, so most of the time goes into reading data rather than computing.
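
A hedged sketch of what that advice looks like with the 0.9-era `trainer_config_helpers` / `PyDataProvider2` API (not code from this issue; the file names, feature sizes, data format, and optimizer values below are placeholders):

```python
# trainer_config.py -- minimal sketch; sizes and hyperparameters are made up.
from paddle.trainer_config_helpers import *

# A larger mini-batch gives each worker thread more work per step,
# so the compute threads spend less time idle.
settings(batch_size=256,
         learning_rate=0.01,
         learning_method=MomentumOptimizer(0.9))

# Samples come from the Python provider sketched below.
define_py_data_sources2(train_list='train.list', test_list=None,
                        module='my_provider', obj='process')

# A tiny placeholder network so the config is complete.
data = data_layer(name='features', size=100)
label = data_layer(name='label', size=10)
prob = fc_layer(input=data, size=10, act=SoftmaxActivation())
outputs(classification_cost(input=prob, label=label))
```

```python
# my_provider.py -- keep per-sample parsing cheap, and let Paddle cache the
# parsed samples in memory after the first pass so the Python reader does
# not dominate the training time.
from paddle.trainer.PyDataProvider2 import (provider, dense_vector,
                                            integer_value, CacheType)

@provider(input_types=[dense_vector(100), integer_value(10)],
          cache=CacheType.CACHE_PASS_IN_MEM)
def process(settings, filename):
    # Assumed format: 100 comma-separated floats followed by an integer label.
    with open(filename) as f:
        for line in f:
            fields = line.strip().split(',')
            yield [float(x) for x in fields[:-1]], int(fields[-1])
```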

reyoung self-assigned this Dec 16, 2016
backyes (Contributor) commented Dec 16, 2016

@janelu9

  • As a quick way to rule things out, try increasing the batch size to check whether a small mini-batch is the cause.

  • If you want to dig deeper into the cause, you can also build Paddle from source with WITH_TIMER enabled to get more quantitative profiling; a build sketch follows below.
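
A rough sketch of the source build mentioned in the second bullet (the `WITH_TIMER` / `WITH_GPU` option names match the build flags listed earlier in this thread, but the clone URL and exact build steps are assumptions and may differ between Paddle versions):

```bash
git clone https://github.com/PaddlePaddle/Paddle.git
cd Paddle && mkdir build && cd build
# Enable the fine-grained timers so training reports per-stage timing.
cmake .. -DWITH_GPU=OFF -DWITH_TIMER=ON
make -j"$(nproc)"
```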

backyes (Contributor) commented Dec 16, 2016

@janelu9

> I set up images on a server with 1, 2, 4, and 8 CPUs

Also, I am not sure whether those counts all refer to physical cores.

janelu9 (Author) commented Dec 20, 2016

@backyes There are 2 physical CPUs with 48 logical cores, 256 GB of memory, running SUSE 12.

janelu9 (Author) commented Dec 21, 2016

Increasing batch_size just means proportionally fewer training iterations, so of course the training time goes down, but accuracy drops.

reyoung (Collaborator) commented Dec 21, 2016

Duplicate of #957. That said, several of the assumptions here are wrong; for example, increasing batch_size does not necessarily change the number of training iterations proportionally.

Let's move the discussion to issue #957.

reyoung closed this as completed Dec 21, 2016
janelu9 (Author) commented Dec 21, 2016

@reyoung Er, not the number of training runs; I meant the number of iterations. But then the amount of computation per iteration is no longer the same.
