Running paddle_simnet.py raises RuntimeError: boost::bad_get: failed value get using boost::get #6

Miopas · 2018-07-16T08:04:15Z

Paddle version: paddlepaddle-gpu==0.14.0, installed via docker

Running sh run_train.sh in the AnyQ/tools/simnet/train/paddle directory fails with this error:

Traceback (most recent call last):
  File "paddle_simnet.py", line 178, in <module>
    train(conf_dict)
  File "paddle_simnet.py", line 91, in train
    main_program=fluid.default_main_program())
  File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/parallel_executor.py", line 155, in __init__
    build_strategy, num_trainers, trainer_id)
RuntimeError: boost::bad_get: failed value get using boost::get

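Solution:
The error is raised inside the fluid.ParallelExecutor constructor, so my guess is that it happens because the Paddle in the docker image runs on a single GPU. I changed the ParallelExecutor-related code to use Executor instead, as follows: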

    ## Get and run executor
    #parallel_executor = fluid.ParallelExecutor(
    #    use_cuda=False, loss_name=avg_cost.name,
    #    main_program=fluid.default_main_program())
    ## Get device number
    #device_count = parallel_executor.device_count
    #logging.info("device count: %d" % device_count)
    # run train
    logging.info("start train process ...")
    for epoch_id in range(conf_dict["epoch_num"]):
        losses = []
        # Get batch data iterator
        batch_data = paddle.batch(reader, conf_dict["batch_size"], drop_last=False)
        start_time = time.time()
        for iter, data in enumerate(batch_data()):
            #if len(data) < device_count:
            #    continue
            #avg_loss = parallel_executor.run(
            #    [avg_cost.name], feed=feeder.feed(data))
            avg_loss = executor.run(
                fetch_list=[avg_cost.name], feed=feeder.feed(data))
            print("epoch: %d, iter: %d, loss: %f" %
                (epoch_id, iter, np.mean(avg_loss[0])))
            losses.append(np.mean(avg_loss[0]))
        end_time = time.time()

After this change the script runs without problems. Posting it here for reference.
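The snippet above calls executor.run but does not show where executor comes from. A minimal sketch of creating one, assuming the script does not already build a plain fluid.Executor earlier (if it does, reuse that one and skip this). The helper name build_single_device_executor is made up for illustration and is not part of AnyQ or Paddle:

```python
def build_single_device_executor(use_cuda=False):
    """Create a plain fluid.Executor on one device and initialize parameters.

    Hypothetical helper for illustration; not part of paddle_simnet.py.
    """
    # Imported lazily so this module can be loaded without Paddle installed.
    import paddle.fluid as fluid

    # CPUPlace matches use_cuda=False in the commented-out ParallelExecutor
    # call; CUDAPlace(0) would select the first GPU (not tested in this thread).
    place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
    executor = fluid.Executor(place)

    # Run the startup program once so that the parameters get initialized.
    # Assumption: nothing earlier in the script has already done this.
    executor.run(fluid.default_startup_program())
    return executor
```

With this in place, executor = build_single_device_executor() before the epoch loop would provide the object used by executor.run(fetch_list=..., feed=...) in the loop.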

oyjxer · 2018-07-16T08:47:47Z

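Hello, thanks for pointing this out. Currently only training in a CPU environment is supported; we plan to gradually improve GPU training later.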

zhanghan1992 closed this as completed Jul 16, 2018

YiLing28 mentioned this issue Oct 23, 2018

Compilation fails because of a protobuf version problem #10

Closed

tx-anin mentioned this issue May 22, 2019

Memory overflow when the system runs under high concurrency #105

Open
