Run inference on single GPU #32

kamalasubha · 2022-08-28T11:58:35Z

Hi,
I completed the setup by following the instructions in the README.
For the evaluation step, I am running:

python -m torch.distributed.launch --nproc_per_node=4 tools/test.py --cfg config/lidar_rcnn.yaml --checkpoint outputs/lidar_rcnn/checkpoint_lidar_rcnn_59.pth.tar
python tools/create_results.py --cfg config/lidar_rcnn.yaml

I have the following questions about running the evaluation:

How do I change the command to run on a single GPU? I assume nproc_per_node needs to be 1.
What should MODEL.Frame be for checkpoint_lidar_rcnn_59.pth.tar?
I am trying to understand the evaluation, so any help with this would be appreciated.

Lzc6996 · 2022-08-31T01:14:41Z

@kamalasubha

Note that nGPUS in the config must equal nproc_per_node; in your case, set both of them to 1.
checkpoint_lidar_rcnn_59.pth.tar was trained with frame = 1.
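
For readers following along, the rule that nGPUS and nproc_per_node must match can be sketched as a small helper that builds the launch command and refuses mismatched values. This is only an illustration: `build_test_command` and its `config_ngpus` argument are hypothetical names, not part of the repository, and the nGPUS value is assumed to be read by hand from config/lidar_rcnn.yaml. The paths are the ones quoted in the question.

```python
import sys


def build_test_command(nproc_per_node, config_ngpus, cfg_path, checkpoint_path):
    """Build the argv list for tools/test.py (hypothetical helper).

    config_ngpus is the nGPUS value the user has set in the config file;
    it must equal nproc_per_node, as stated in the reply above.
    """
    if nproc_per_node != config_ngpus:
        raise ValueError(
            f"nGPUS in config ({config_ngpus}) must equal "
            f"nproc_per_node ({nproc_per_node})"
        )
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}",
        "tools/test.py",
        "--cfg", cfg_path,
        "--checkpoint", checkpoint_path,
    ]


# Single-GPU case: both values set to 1.
cmd = build_test_command(
    1, 1,
    "config/lidar_rcnn.yaml",
    "outputs/lidar_rcnn/checkpoint_lidar_rcnn_59.pth.tar",
)
print(" ".join(cmd))
```

The printed line is the original evaluation command with only --nproc_per_node changed to 1; the config file still has to be edited separately so that nGPUS is 1 as well.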

kamalasubha · 2022-08-31T02:03:27Z

@Lzc6996
Thanks for the input.
I am getting the following error with the above config:

Traceback (most recent call last):
  File "tools/test.py", line 91, in <module>
    test(cfg, 0, valloader, model, device, cfg.TEST.TAT_PATH)
  File "/home/lidar/LiDAR_RCNN/src/LiDAR_RCNN/core/function.py", line 113, in test
    for idx, batch in enumerate(testloader):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 28, in fetch
    data.append(next(self.dataset_iter))
  File "/home/lidar/LiDAR_RCNN/src/LiDAR_RCNN/datasets/waymo/loader.py", line 87, in transform_test
    pcd_cur, pcd_pre, proposal, gt_box, gt_cls = load_data(it, self.frame)
  File "/home/lidar/LiDAR_RCNN/src/LiDAR_RCNN/datasets/waymo/data_utils.py", line 65, in load_data
    pcd_cur_ri1 = np.hstack([pcd_cur_ri1, pcd_add_ones_ri1])
  File "<__array_function__ internals>", line 6, in hstack
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/shape_base.py", line 344, in hstack
    return _nx.concatenate(arrs, 0)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 2 dimension(s)

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '-u', 'tools/test.py', '--local_rank=0', '--cfg', 'config/lidar_rcnn.yaml', '--checkpoint', '/home/lidar/models/checkpoint_lidar_rcnn_59.pth.tar']' returned non-zero exit status 1.

I used the val tfrecords from allpreprocessor.zip, which was shared by email. Any idea what is causing this?
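
The final ValueError in the first traceback can be reproduced outside the repository. The sketch below assumes, purely for illustration, that the point array arrives as a flat 1-D array while the column of ones is 2-D; the variable names and shapes are hypothetical and the actual contents of the tfrecords are not known from this thread.

```python
import numpy as np

# Hypothetical stand-ins for pcd_cur_ri1 and pcd_add_ones_ri1.
pcd_flat = np.zeros(4, dtype=np.float32)       # shape (4,), 1-D
ones_col = np.ones((4, 1), dtype=np.float32)   # shape (4, 1), 2-D

error_message = None
try:
    # hstack concatenates along axis 0 when the first array is 1-D,
    # so a 1-D and a 2-D input cannot be combined.
    np.hstack([pcd_flat, ones_col])
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# With both inputs 2-D, the same call succeeds.
pcd_2d = pcd_flat.reshape(-1, 1)               # shape (4, 1)
stacked = np.hstack([pcd_2d, ones_col])
print(stacked.shape)
```

The reshape only shows that the error depends on the number of dimensions of the inputs. It is not a fix for the loader, since it does not explain why the data has an unexpected shape in the first place.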

Lzc6996 · 2022-08-31T02:08:41Z

@kamalasubha
I guess this is because we updated the code for multi-frame support, but the val.tfrecord was generated by the previous code.
Can you generate the data yourself by following data_processer?

kamalasubha · 2022-08-31T02:18:51Z

OK @Lzc6996, I will generate the val tfrecord based on the link given. But can you please explain which part of the code needs to be changed to support a single frame? Is there a separate demo for this?

Lzc6996 · 2022-08-31T02:27:25Z

@kamalasubha
I have another suggestion: you can use the old release of our code, v0.1.1.
Actually, I can't figure out why the two versions are incompatible without running it myself.

kamalasubha · 2022-08-31T03:11:40Z

Thanks @Lzc6996 . I will look into it

kamalasubha · 2022-08-31T06:13:12Z

@Lzc6996 I am able to run it with the older version. Thanks for the input.

kamalasubha closed this as completed Aug 31, 2022
