
SOC problem #6

Closed · DavidKong96 opened this issue Dec 8, 2020 · 14 comments

DavidKong96 commented Dec 8, 2020

Thanks for sharing. Nice work!
Here is a question about SOC in your paper.
The self-supervised stage is applied to a new-domain dataset, so is the new (target) dataset the one we will test on later?

Another question: when I try to train MODNet, the predicted d_p is just the boundary, which is not the same as in your paper.

ZHKKKe (Owner) commented Dec 9, 2020

Hi, thanks for your attention!

For your questions:
Q1: is the new (target) dataset the one we will test on later?
The new-domain dataset should be split into a training subset S_t and a validation subset S_v. The SOC self-supervised strategy fine-tunes the model on S_t and tests it on S_v.
For example, if a model is fine-tuned on the data from our WebCam (S_t), users can test this model with the data captured by their own WebCam (S_v).
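
(As a concrete illustration, a split like the following sketch would do; the path, glob pattern, and 90/10 ratio are my assumptions, not from the paper.)

import glob

# Hypothetical S_t / S_v split of a new-domain dataset (illustrative only)
frames = sorted(glob.glob('new_domain/*.jpg'))
split = int(0.9 * len(frames))
S_t, S_v = frames[:split], frames[split:]  # fine-tune on S_t, test on S_v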

Q2: the predicted d_p is just the boundary, which is not the same as in your paper
It is difficult for me to point out the problem from these two images alone. Can you share one of your training samples (including I, alpha_g, and m_d from Fig. 2 of the paper) with me? I can try to help you find the problem. Thanks.

DavidKong96 (Author) commented:

Thanks a lot for your reply. I have found the reason for Q2: my dilate/erode kernel size was probably too small, which produced the wrong result. Could you please tell me the size of your dilate/erode kernel?
Another question: when we run the self-supervised stage on video, is it still based on single images? Is any temporal information used?
Thank you very much!

ZHKKKe (Owner) commented Dec 9, 2020

Q1: could you please tell me the size of your dilate/erode kernel?
I use scipy.ndimage.grey_dilation and scipy.ndimage.grey_erosion for dilation and erosion.
For an image with a short side of 512:

  • In the supervised training stage, m_d is generated with a random kernel size in (5, 10).
  • In the SOC self-supervised stage, m_d is generated with a kernel size of 30.

In fact, the m_d in Figure 2 of the paper is a good example; please set your parameters according to it.
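
(For reference, a minimal sketch of how such a boundary mask m_d could be built with these scipy calls; the thresholding rule is my assumption, not the paper's or the released code's.)

import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

def make_boundary_mask(matte, kernel_size):
    # matte: float alpha map in [0, 1]; kernel_size: random in (5, 10) for
    # supervised training, 30 for SOC, per the numbers above
    dilated = grey_dilation(matte, size=(kernel_size, kernel_size))
    eroded = grey_erosion(matte, size=(kernel_size, kernel_size))
    # assumed: m_d marks the transition band where dilation and erosion
    # disagree; the 0.1 threshold is an illustrative choice
    return ((dilated - eroded) > 0.1).astype(np.float32)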

Q2: is it still based on single images? Is any temporal information used?
It is still based on single images. In this work, we did not consider temporal information.

DavidKong96 (Author) commented:

Thank you very much. It helps me a lot.

ZHKKKe closed this as completed Dec 11, 2020
TsykunovDmitriy commented Dec 23, 2020

Hello! Thanks for your interesting work.
I am trying to reproduce the SOC training. The matte predictions start converging to pred_detail pretty quickly. This behavior seems quite logical. Have you encountered this? If so, how did you deal with it?

(before / after images attached)

ZHKKKe (Owner) commented Dec 24, 2020

@TsykunovDmitriy
Hi, thanks for your attention!
Based on your results, I think you forgot to use \tilde{m}_d (Eq. 7 and 8 in our paper) to apply the detail-consistency losses (Eq. 8 and the second term of Eq. 7) only to the boundary regions. Please check it.

TsykunovDmitriy commented Dec 24, 2020

Thanks for the answer. Below is the pseudocode for my implementation of equations 7 and 8, which I use in my training pipeline. Tell me where I'm wrong.

# predictions from the trainable model and its frozen copy
pred_semantic, pred_detail, pred_matte = model(image)
pred_semantic_fz, pred_detail_fz, pred_matte_fz = model_freeze(image)

# masks derived from the current matte prediction
de_mask = get_dilate_erode_mask(pred_matte.numpy())
seg_mask = get_segmentation_mask(pred_matte.numpy())

# equation 7
Ls = 0.5 * sqrt((pred_semantic - seg_mask) ** 2).mean()
Ld = (abs(pred_detail - pred_matte) * de_mask).sum() / de_mask.sum()  # same as in supervised training
Lcons = Ls + Ld

# equation 8
Ldd = (abs(pred_detail - pred_detail_fz) * de_mask).sum() / de_mask.sum()

loss = Lcons + Ldd
model.update_weights(loss)

If possible, please answer a few more questions. How much data did you use for fine-tuning? How many epochs? What does "simultaneously" mean in the paper?

ZHKKKe (Owner) commented Dec 24, 2020

Q1: How much data did you use for fine-tuning?
We use 400 video clips consisting of 50k frames.

Q2: How many epochs?
About 10 epochs.

Q3: What does "simultaneously" mean in the paper?
"simultaneously" means loss = Lcons + Ldd. You are correct.

However, you should split Ls into two terms:

Ls = 0.5 * sqrt((pred_semantic - seg_mask.detach()) ** 2).mean() + 0.5 * sqrt((pred_semantic.detach() - seg_mask) ** 2).mean()

The gradients should flow back through both branches at the same time (make sure the gradient can be back-propagated through seg_mask).
Of course, you also need to do the same thing for Ld; see the sketch below.
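
(A minimal sketch of that split applied to Ld, following my reading of the advice above rather than the authors' code:)

# both the detail and matte branches receive gradients
Ld = 0.5 * ((abs(pred_detail - pred_matte.detach()) * de_mask).sum() / de_mask.sum()) \
   + 0.5 * ((abs(pred_detail.detach() - pred_matte) * de_mask).sum() / de_mask.sum())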

Besides, could you visualize a sample with de_mask and seg_mask? They are important for finding the problem.

ZHKKKe reopened this Dec 24, 2020
TsykunovDmitriy commented:

Thanks a lot for the answers.
I thought G(*) was a non-differentiable function. I will run a few experiments, and if the problem is not solved, I will share additional visualizations.

TsykunovDmitriy commented Dec 25, 2020

I did some experiments. Unfortunately, the result has not improved: the pred_matte predictions still converge pretty quickly to pred_detail. I have visualized some examples of de_mask and seg_mask.

(two screenshots of de_mask and seg_mask attached)

I have a guess that I have incorrectly formulated equation 5 from the paper. Could you please comment on the Ld from here?

ZHKKKe (Owner) commented Dec 25, 2020

@TsykunovDmitriy
I think your de_mask and seg_mask are correct. I cannot find the problem from your code snippet.

Maybe you can try the following code for calculating Ld:

Ld = (abs(pred_detail - pred_detail_fz.detach()) * de_mask + abs(pred_matte - pred_matte_fz.detach()) * de_mask).sum() / de_mask.sum()

In this way, you no longer need Ldd, i.e., loss = Lcons: with the frozen predictions detached, this single masked term anchors both pred_detail and pred_matte to the frozen model's outputs inside the boundary region, so it covers both boundary terms at once.

This is our old implementation. It also worked well in our case.

TsykunovDmitriy commented:
Thanks for your advice! Unfortunately, I did not obtain satisfactory results. I think this is due to the small amount of data used when training the model; perhaps my model does not generalize well enough. I think you can close the discussion.

ZHKKKe (Owner) commented Jan 13, 2021

@TsykunovDmitriy
Hi. Our main training code will be released within the next two weeks. I hope it helps you once it is released.

ZHKKKe (Owner) commented Jan 28, 2021

Hi all,

Our main code for SOC adaptation is available now.

ZHKKKe closed this as completed Jan 28, 2021