
Problems with optical flow results when fine-tuning on a real scene? #19

Open
poincarelee opened this issue Mar 30, 2022 · 5 comments

@poincarelee

Hi,
have you tried training on real scenes such as markets or subways? I fine-tuned the model following IRR-PWC, starting from your './checkpoints/fastflownet_ft_mix.pth', on a real subway scene, but the results are much worse than FlowNet2's.
I also ran into another odd problem: during prediction, whether or not I multiply the optical flow result by div_flow (20), there is no visible difference in the flow PNG (the flow result converted to PNG).
[Flow visualizations: flownet2_1857 and fastflow_train_1857_347]

@ltkong218
Owner

Do you use ground-truth flow labels of your real scene to train FastFlowNet in a supervised manner, or do you train in an unsupervised way? For optical flow visualization, the scale factor is normalized in the current code, so there is no difference whether or not you multiply the optical flow result by div_flow (20); you can modify the code to meet your needs.
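
To illustrate why the visualization is scale-invariant, here is a minimal flow-to-color sketch (not the repository's exact code; the function name and the HSV mapping are assumptions). Because the magnitude is normalized by its per-image maximum, multiplying the whole field by div_flow cancels out:

```python
import numpy as np
from matplotlib.colors import hsv_to_rgb

def flow_to_color(flow, max_mag=None):
    """Convert an (H, W, 2) flow field to an RGB image in [0, 1].

    If max_mag is None, the magnitude is normalized by its per-image
    maximum, so multiplying the flow by any constant (e.g. div_flow=20)
    produces exactly the same picture.
    """
    u, v = flow[..., 0], flow[..., 1]
    mag = np.sqrt(u * u + v * v)
    ang = np.arctan2(-v, -u) / np.pi                        # direction in [-1, 1]

    hsv = np.zeros(flow.shape[:2] + (3,), dtype=np.float32)
    hsv[..., 0] = (ang + 1.0) / 2.0                         # hue encodes direction
    hsv[..., 1] = 1.0                                       # full saturation
    denom = mag.max() if max_mag is None else max_mag
    hsv[..., 2] = np.clip(mag / (denom + 1e-8), 0.0, 1.0)   # value encodes speed
    return hsv_to_rgb(hsv)
```

Passing a fixed `max_mag` instead keeps an absolute scale across frames, so differently scaled predictions would then look different in the PNG.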

@poincarelee
Author

Hi,
I use FlowNet2's predictions as ground truth, since ground-truth optical flow cannot be obtained for real scenes.
As for div_flow, yes, you are right.
Are there any other tricks I missed while training FastFlowNet? I followed IRR-PWC exactly.
Have you already applied FastFlowNet to real-scene images? I would expect FastFlowNet's performance to be much better than the results I got.

@ltkong218
Owner

I think taking FlowNet2's predictions as ground truth will lead to error accumulation; I suggest you try adopting RAFT's predictions as the ground-truth labels. For training FastFlowNet, you should normalize the input images to [0, 1] and subtract the per-channel mean value, and the ground-truth flow should be divided by div_flow (20). Data augmentation such as geometric and color augmentation should be adopted for better generalization.
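
A minimal preprocessing sketch of the steps above (the tensor shapes, the shared per-pair mean, and the function name are my assumptions, not the official pipeline):

```python
import torch

DIV_FLOW = 20.0  # FastFlowNet's flow scaling factor

def preprocess_pair(img1, img2, gt_flow):
    """Sketch of the preprocessing described above (not the official code).

    img1, img2: uint8 tensors of shape (3, H, W)
    gt_flow:    float tensor of shape (2, H, W), e.g. RAFT predictions
    """
    # 1) Scale images to [0, 1].
    img1 = img1.float() / 255.0
    img2 = img2.float() / 255.0

    # 2) Subtract the per-channel mean, computed jointly over both frames
    #    (one common convention in PWC-style pipelines; the exact choice
    #    of mean is an assumption here).
    mean = torch.stack([img1, img2]).mean(dim=(0, 2, 3)).view(3, 1, 1)
    img1 = img1 - mean
    img2 = img2 - mean

    # 3) Scale the ground-truth flow down; the network regresses
    #    flow / div_flow, so the label must live in the same range.
    gt_flow = gt_flow / DIV_FLOW
    return img1, img2, gt_flow
```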

It would be good to pretrain FastFlowNet in a self-supervised manner, which does not suffer from a domain gap. I will release the training code if my paper currently under review is published.
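
Since that training code is not yet released, the following is only a generic sketch of the photometric warping loss that self-supervised flow methods typically minimize; the helper names are hypothetical, and occlusion handling and smoothness terms are omitted:

```python
import torch
import torch.nn.functional as F

def warp(img, flow):
    """Backward-warp img (B, C, H, W) with flow (B, 2, H, W) in pixels."""
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=img.device, dtype=img.dtype),
        torch.arange(w, device=img.device, dtype=img.dtype),
        indexing="ij",
    )
    grid_x = xs + flow[:, 0]
    grid_y = ys + flow[:, 1]
    # Normalize sampling coordinates to [-1, 1] for grid_sample.
    grid = torch.stack(
        (2.0 * grid_x / (w - 1) - 1.0, 2.0 * grid_y / (h - 1) - 1.0), dim=-1
    )
    return F.grid_sample(img, grid, align_corners=True)

def photometric_loss(img1, img2, flow12):
    """Charbonnier penalty between img1 and img2 warped back by flow12."""
    diff = img1 - warp(img2, flow12)
    return torch.sqrt(diff ** 2 + 1e-6).mean()
```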

@poincarelee
Author

OK, thanks for your timely reply.

  1. Good advice, I will try RAFT's predictions ASAP.
  2. Dividing the flow ground truth by 20 is already handled in the code; I will try the input normalization. Does 'subtract the per-channel mean value' mean subtracting the mean of each RGB channel? I don't fully understand this. Geometric and color augmentation has already been adopted.
  3. I haven't tried the self-supervised manner on a real-scene dataset; I will try it.

@plateau1
Copy link

Could you please share the details of training FastFlowNet?

  1. The normalization operation from demo.py has been used.

  2. The ground truth has also been divided by 20.

  3. The learning rate is halved at fixed stages (see the scheduler sketch after this list).
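
If "halved at fixed stages" means a step schedule like the ones used for FlowNet/PWC-style training, it can be written with PyTorch's MultiStepLR; the milestone iterations and the stand-in model below are assumptions:

```python
import torch

# Stand-in model; replace with FastFlowNet. Milestones are assumptions.
model = torch.nn.Conv2d(6, 2, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[400_000, 600_000, 800_000, 1_000_000], gamma=0.5
)

for step in range(1_200_000):
    # forward pass, loss.backward(), optimizer.step() would go here
    optimizer.zero_grad()
    scheduler.step()  # halves the learning rate at each milestone
```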

The problem we are facing now is that we trained on FlyingChairs and used MPI-Sintel as the validation set, and found that the EPE stays at 13.5 px even though the training loss is very small, only 0.3.

I checked the gradients and the backward pass, and also tried adjusting the learning rate, but nothing helped. @poincarelee
