Running paddle_simnet.py raises RuntimeError: boost::bad_get: failed value get using boost::get #6

Closed
Miopas opened this issue Jul 16, 2018 · 1 comment

Comments

Miopas commented Jul 16, 2018

Paddle version: paddlepaddle-gpu==0.14.0, installed via docker.

Running sh run_train.sh in the AnyQ/tools/simnet/train/paddle directory fails with the following error:

Traceback (most recent call last):
  File "paddle_simnet.py", line 178, in <module>
    train(conf_dict)
  File "paddle_simnet.py", line 91, in train
    main_program=fluid.default_main_program())
  File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/parallel_executor.py", line 155, in __init__
    build_strategy, num_trainers, trainer_id)
RuntimeError: boost::bad_get: failed value get using boost::get

Solution:
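The error is raised while constructing fluid.ParallelExecutor, so my guess is that the Paddle build in the docker image only runs on a single GPU. I changed the ParallelExecutor-related code to use Executor instead, as follows: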

    ## Get and run executor
    #parallel_executor = fluid.ParallelExecutor(
    #    use_cuda=False, loss_name=avg_cost.name,
    #    main_program=fluid.default_main_program())
    ## Get device number
    #device_count = parallel_executor.device_count
    #logging.info("device count: %d" % device_count)
    # run train
    logging.info("start train process ...")
    for epoch_id in range(conf_dict["epoch_num"]):
        losses = []
        # Get batch data iterator
        batch_data = paddle.batch(reader, conf_dict["batch_size"], drop_last=False)
        start_time = time.time()
        for iter, data in enumerate(batch_data()):
            #if len(data) < device_count:
            #    continue
            #avg_loss = parallel_executor.run(
            #    [avg_cost.name], feed=feeder.feed(data))
            avg_loss = executor.run(
                fetch_list=[avg_cost.name], feed=feeder.feed(data))
            print("epoch: %d, iter: %d, loss: %f" %
                (epoch_id, iter, np.mean(avg_loss[0])))
            losses.append(np.mean(avg_loss[0]))
        end_time = time.time()

After this change the script runs without problems. Posting this for reference.
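
For anyone who would rather not comment the code out by hand, the same workaround can be sketched as a fallback: try to build the ParallelExecutor and, if construction raises RuntimeError, use the plain Executor. This is only a rough sketch and not code from the AnyQ repository. `build_runner`, `make_parallel` and `make_single` are hypothetical names, and the two callables are assumed to wrap the fluid.ParallelExecutor(...) call from the traceback and the `executor` already created in paddle_simnet.py.

```python
import logging


def build_runner(make_parallel, make_single):
    """Return (runner, is_parallel).

    make_parallel and make_single are zero-argument callables (hypothetical
    names) that build the multi-device and single-device executor.
    """
    try:
        return make_parallel(), True
    except RuntimeError as err:
        # For example: "boost::bad_get: failed value get using boost::get"
        logging.warning("ParallelExecutor unavailable (%s), using Executor", err)
        return make_single(), False


# Possible use inside train(), assuming `executor` and `avg_cost` exist there:
# runner, is_parallel = build_runner(
#     lambda: fluid.ParallelExecutor(
#         use_cuda=False, loss_name=avg_cost.name,
#         main_program=fluid.default_main_program()),
#     lambda: executor)
# avg_loss = runner.run(fetch_list=[avg_cost.name], feed=feeder.feed(data))
```

The `device_count` check from the original loop would only apply when `is_parallel` is true.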

oyjxer (Collaborator) commented Jul 16, 2018

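Hello, and thanks for pointing this out. At the moment only training in a CPU environment is supported. We plan to gradually complete the GPU training part later.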