Dimension issue on RAFT #74

Closed
ckcraig01 opened this issue Jan 18, 2022 · 5 comments · Fixed by #83

Comments

@ckcraig01

ckcraig01 commented Jan 18, 2022

Thanks for your great work!

Test input script:

python tools/test.py configs/raft/raft_8x2_100k_flyingchairs_368x496.py  /ckpt/raft_8x2_100k_flyingchairs.pth --eval EPE

The error message is as follows:

[ ] 0/640, elapsed: 0s, ETA:Traceback (most recent call last):
File "tools/test.py", line 178, in <module>
main()
File "tools/test.py", line 171, in main
f'In {dataset_name} '
File "root/framework/optical_flow/mmflow/mmflow/core/evaluation/evaluation.py", line 38, in online_evaluation
model, data_loader, metric=metric, **kwargs)
File "root/framework/optical_flow/mmflow/mmflow/core/evaluation/evaluation.py", line 68, in single_gpu_online_evaluation
batch_results = model(test_mode=True, **data)
File "root/miniconda3/envs/mmflow/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "root/miniconda3/envs/mmflow/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 48, in forward
return self.module(*inputs[0], **kwargs[0])
File "root/miniconda3/envs/mmflow/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "root/framework/optical_flow/mmflow/mmflow/models/flow_estimators/base.py", line 61, in forward
return self.forward_test(*args, **kwargs)
File "root/framework/optical_flow/mmflow/mmflow/models/flow_estimators/raft.py", line 145, in forward_test
feat1, feat2, h_feat, cxt_feat = self.extract_feat(imgs)
File "root/framework/optical_flow/mmflow/mmflow/models/flow_estimators/raft.py", line 72, in extract_feat
feat1 = self.encoder(img1)
File "root/miniconda3/envs/mmflow/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "root/framework/optical_flow/mmflow/mmflow/models/encoders/raft_encoder.py", line 293, in forward
x = self.conv1(x)
File "root/miniconda3/envs/mmflow/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "root/miniconda3/envs/mmflow/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 443, in forward
return self._conv_forward(input, self.weight, self.bias)
File "root/miniconda3/envs/mmflow/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 440, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 5-dimensional input of size [1, 1, 6, 384, 512] instead

Could you help me with this?

@ckcraig01
Author

I am using PyTorch 1.9 and mmcv 1.4.3.

@MeowZheng
Collaborator

MeowZheng commented Jan 18, 2022

I tried the same command line

python tools/test.py configs/raft/raft_8x2_100k_flyingchairs_368x496.py  /ckpt/raft_8x2_100k_flyingchairs.pth --eval EPE

but could not reproduce the issue.

Could you add a breakpoint at the following line and check the shape of the input images before extract_feat? The valid shape is (N, 6, H, W).
File "root/framework/optical_flow/mmflow/mmflow/models/flow_estimators/raft.py", line 145, in forward_test
feat1, feat2, h_feat, cxt_feat = self.extract_feat(imgs)
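
The shape check suggested above can also be done without a debugger. Below is a minimal sketch, not part of mmflow: `check_flow_input_shape` is a hypothetical helper name, and it only assumes that the object passed in behaves like a sequence of integers (as `imgs.shape` does in PyTorch).

```python
from typing import Sequence, Tuple


def check_flow_input_shape(shape: Sequence[int],
                           pair_channels: int = 6) -> Tuple[bool, str]:
    """Check that a shape looks like (N, 6, H, W).

    Hypothetical helper for debugging; `pair_channels` is 6 because two
    RGB images are concatenated along the channel dimension.
    """
    dims = tuple(int(d) for d in shape)
    if len(dims) != 4:
        return False, (f"expected 4 dimensions (N, {pair_channels}, H, W), "
                       f"got {len(dims)}: {dims}")
    if dims[1] != pair_channels:
        return False, (f"expected {pair_channels} channels in dimension 1, "
                       f"got {dims[1]}: {dims}")
    return True, f"ok: {dims}"


# In forward_test one could call: check_flow_input_shape(imgs.shape)
print(check_flow_input_shape((1, 6, 384, 512)))
print(check_flow_input_shape((1, 1, 6, 384, 512)))
```

The second call uses the shape from the error message and reports the extra leading dimension instead of failing deep inside the first convolution.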

@ckcraig01
Author

ckcraig01 commented Jan 19, 2022

@MeowZheng Thank you for the prompt feedback.

Following your suggestion, I printed the image shape:
print(imgs.shape)
Result: torch.Size([1, 1, 6, 384, 512])

I also printed data['imgs'] at
https://github.com/open-mmlab/mmflow/blob/master/mmflow/apis/test.py#L47
(output below).

Could you let me know where I should continue investigating? Thanks.

DataContainer([tensor([[[[-0.8353, -0.8588, -0.8667, ..., -0.2863, -0.2863, -0.2784],
[-0.8980, -0.8824, -0.8431, ..., -0.2863, -0.2863, -0.2863],
[-0.8510, -0.8824, -0.8745, ..., -0.2392, -0.2549, -0.2549],
...,
[-0.9686, -0.9608, -0.9529, ..., -0.8902, -0.8824, -0.8510],
[-0.9451, -0.9608, -0.9686, ..., -0.8902, -0.8745, -0.8588],
[-0.9294, -0.9608, -0.9765, ..., -0.8980, -0.8824, -0.8745]],

     [[-0.8667, -0.8902, -0.8980,  ..., -0.1294, -0.1294, -0.1294],
      [-0.9059, -0.8902, -0.8510,  ..., -0.1294, -0.1294, -0.1294],
      [-0.8588, -0.8902, -0.8824,  ..., -0.0745, -0.0980, -0.1059],
      ...,
      [-0.9608, -0.9529, -0.9529,  ..., -0.9216, -0.9373, -0.9216],
      [-0.9373, -0.9529, -0.9686,  ..., -0.9216, -0.9451, -0.9294],
      [-0.9216, -0.9529, -0.9765,  ..., -0.9294, -0.9529, -0.9451]],

     [[-0.8275, -0.8510, -0.8588,  ..., -0.0353, -0.0275, -0.0275],
      [-0.8902, -0.8745, -0.8196,  ..., -0.0353, -0.0353, -0.0275],
      [-0.8275, -0.8588, -0.8510,  ...,  0.0275, -0.0039,  0.0039],
      ...,
      [-0.9765, -0.9686, -0.9529,  ..., -0.8824, -0.9137, -0.8980],
      [-0.9529, -0.9686, -0.9686,  ..., -0.8824, -0.9216, -0.9059],
      [-0.9373, -0.9686, -0.9765,  ..., -0.8902, -0.9294, -0.9216]],

     [[-0.8353, -0.8588, -0.8667,  ..., -0.2706, -0.2706, -0.2627],
      [-0.8980, -0.8824, -0.8431,  ..., -0.2706, -0.2706, -0.2627],
      [-0.8510, -0.8824, -0.8745,  ..., -0.2784, -0.2784, -0.2706],
      ...,
      [-0.9686, -0.9608, -0.9529,  ..., -0.8902, -0.8824, -0.8510],
      [-0.9451, -0.9608, -0.9686,  ..., -0.8902, -0.8745, -0.8588],
      [-0.9294, -0.9608, -0.9765,  ..., -0.8980, -0.8824, -0.8745]],

     [[-0.8667, -0.8902, -0.8980,  ..., -0.1216, -0.1137, -0.1059],
      [-0.9059, -0.8902, -0.8510,  ..., -0.1216, -0.1137, -0.1059],
      [-0.8588, -0.8902, -0.8824,  ..., -0.1216, -0.1216, -0.1137],
      ...,
      [-0.9608, -0.9529, -0.9529,  ..., -0.9216, -0.9373, -0.9216],
      [-0.9373, -0.9529, -0.9686,  ..., -0.9216, -0.9451, -0.9294],
      [-0.9216, -0.9529, -0.9765,  ..., -0.9294, -0.9529, -0.9451]],

     [[-0.8275, -0.8510, -0.8588,  ..., -0.0196, -0.0196, -0.0118],
      [-0.8902, -0.8745, -0.8196,  ..., -0.0196, -0.0196, -0.0118],
      [-0.8275, -0.8588, -0.8510,  ..., -0.0196, -0.0196, -0.0118],
      ...,
      [-0.9765, -0.9686, -0.9529,  ..., -0.8824, -0.9137, -0.8980],
      [-0.9529, -0.9686, -0.9686,  ..., -0.8824, -0.9216, -0.9059],
      [-0.9373, -0.9686, -0.9765,  ..., -0.8902, -0.9294, -0.9216]]]])])
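
A note on why the encoder sees the full 5-dimensional tensor. The sketch below works on shapes only and rests on an assumption not confirmed in this thread: that the estimator splits the concatenated image pair by slicing along dimension 1 with `in_channels = 3` (consistent with the `[64, 3, 7, 7]` weight in the error). `sliced_shape` and `split_image_pair` are hypothetical names used for illustration.

```python
from typing import Optional, Tuple

Shape = Tuple[int, ...]


def sliced_shape(shape: Shape, dim: int,
                 start: Optional[int], stop: Optional[int]) -> Shape:
    """Shape that results from slicing start:stop along dimension `dim`."""
    size = len(range(shape[dim])[start:stop])
    return shape[:dim] + (size,) + shape[dim + 1:]


def split_image_pair(shape: Shape, in_channels: int = 3) -> Tuple[Shape, Shape]:
    """Shapes of img1 and img2 when splitting along dimension 1 (assumed)."""
    img1 = sliced_shape(shape, 1, None, in_channels)
    img2 = sliced_shape(shape, 1, in_channels, None)
    return img1, img2


# Expected 4-D input: each image gets 3 channels.
print(split_image_pair((1, 6, 384, 512)))
# 5-D input from this issue: dimension 1 has size 1, so the slice for
# img1 leaves the shape unchanged and img2 ends up empty along that axis.
print(split_image_pair((1, 1, 6, 384, 512)))
```

Under that assumption, the first slice of a `[1, 1, 6, 384, 512]` tensor is still `[1, 1, 6, 384, 512]`, which is exactly the size reported in the RuntimeError.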

@ckcraig01
Author

ckcraig01 commented Jan 19, 2022

Hi, solved!

I had installed a CPU-only version of PyTorch, which caused this issue.

I resolved it with the following installation steps:

- conda create -n mmflow python=3.8 conda -y
- conda activate mmflow
- conda install pytorch torchvision cudatoolkit=11.3 -c pytorch
- pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10.0/index.html
- pip install -r requirements/build.txt
- pip install -v -e . 
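
To check whether an environment has a CPU-only PyTorch build before running the test script, something like the following sketch may help. `describe_torch_build` is a hypothetical helper name; it only relies on `torch.__version__`, `torch.version.cuda` (which is `None` for CPU-only builds), and `torch.cuda.is_available()`.

```python
def describe_torch_build() -> dict:
    """Summarize the installed PyTorch build (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment at all.
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,
        # CUDA version the wheel was built with; None means CPU-only.
        "built_with_cuda": torch.version.cuda,
        # True only if a CUDA build can also see a usable GPU and driver.
        "cuda_available": torch.cuda.is_available(),
    }


print(describe_torch_build())
```

If `built_with_cuda` is `None`, the build is CPU-only and reinstalling with a matching `cudatoolkit`, as in the steps above, is one way to change that.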

But I found another issue.
(1) The call site:

write_flow(osp.join(out_dir, f'flow_{i:03d}.flo'), r)

(2) The function definition:
def write_flow(flow: np.ndarray, flow_file: str) -> None:

The argument order seems to be reversed? Thanks.
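
One way to make the call robust against this kind of mismatch is to pass keyword arguments. This is only a sketch: the `write_flow` below is a hypothetical stand-in that copies the parameter order quoted above and records calls instead of writing files, and `results` holds placeholder values rather than real flow arrays.

```python
import os.path as osp
from typing import Any, List, Tuple

# Records (path, flow) pairs instead of touching the file system.
written: List[Tuple[str, Any]] = []


def write_flow(flow: Any, flow_file: str) -> None:
    """Stand-in with the same parameter order as the quoted definition."""
    if not isinstance(flow_file, str):
        raise TypeError(
            f"flow_file must be a path string, got {type(flow_file).__name__}")
    written.append((flow_file, flow))


out_dir = "out"
results = [[[0.0, 0.0]], [[1.0, -1.0]]]  # placeholder flow values

for i, r in enumerate(results):
    # Keyword arguments make the call independent of parameter order.
    write_flow(flow=r, flow_file=osp.join(out_dir, f"flow_{i:03d}.flo"))

# The positional call from (1) passes the path as `flow` and is rejected here.
try:
    write_flow(osp.join(out_dir, "flow_000.flo"), results[0])
except TypeError as err:
    print("swapped arguments detected:", err)
```

Either the call site or the definition could be changed to fix the mismatch; the sketch keeps the quoted signature and adjusts only the call.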

@MeowZheng
Collaborator

Many thanks for finding the bug in write_flow. Community contributions are more than welcome in OpenMMLab repos, and we hope to cooperate closely with the community, so we would very much appreciate it if you could create a PR to fix the bug. Here is a tutorial for creating a PR: https://github.com/open-mmlab/mmcv/blob/master/docs/en/community/pr.md
